Vegan-specific signature implies healthier metabolic profile: findings from diet-related multi-omics observational study based on different European populations

Statistical report for microbiome analysis (SGB level, prevalence > 30% in training dataset)


Authors and affiliations

Monika Cahova1,*, Anna Ouradova2,*, Giulio Ferrero3,4,*, Miriam Bratova1, Nikola Daskova1, Klara Dohnalova5, Marie Heczkova1, Karel Chalupsky5, Maria Kralova6,7, Marek Kuzma8, Filip Tichanek1, Lucie Najmanova8, Barbara Pardini10, Helena Pelantová8, Radislav Sedlacek5, Sonia Tarallo9, Petra Videnska10, Jan Gojda2,#, Alessio Naccarati9,#


* These authors have contributed equally to this work and share first authorship
# These authors have contributed equally to this work and share last authorship

1 Institute for Clinical and Experimental Medicine, Prague, Czech Republic
2 Department of Internal Medicine, Kralovske Vinohrady University Hospital and Third Faculty of Medicine, Charles University, Prague, Czech Republic
3 Department of Clinical and Biological Sciences, University of Turin, Turin, Italy
4 Department of Computer Science, University of Turin, Turin, Italy
5 Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
6 Ambis University, Department of Economics and Management, Prague, Czech Republic
7 Department of Applied Mathematics and Computer Science, Masaryk University, Brno, Czech Republic
8 Institute of Microbiology of the Czech Academy of Sciences, Prague, Czech Republic
9 Italian Institute for Genomic Medicine (IIGM), c/o IRCCS Candiolo, Turin, Italy
10 Mendel University, Department of Chemistry and Biochemistry, Brno, Czech Republic


This is a statistical report for the study "Vegan-specific signature implies healthier metabolic profile: findings from diet-related multi-omics observational study based on different European populations", which has been submitted to [TO BE ADDED]

When using this code or data, cite the original publication:

TO BE ADDED

BibTex citation for the original publication:

TO BE ADDED


Original GitHub repository: https://github.com/filip-tichanek/ItCzVegans

Statistical reports can be found on the reports hub.

Data analysis is described in detail in the statistical methods report.


1 Introduction

This project explores potential signatures of a vegan diet across the microbiome, metabolome, and lipidome. We used data from healthy vegan and omnivorous human subjects in two countries (Czech Republic and Italy), with subjects grouped by Country and Diet, resulting in four distinct groups.

To assess the generalizability of these findings, we validated our results in an independent cohort from the Czech Republic.

1.1 Statistical Methods

The statistical modeling approach is described in detail in this report. Briefly, the methods used included:

  • Multivariate analysis: We conducted multivariate analyses (PERMANOVA, PCA, correlation analyses) to explore the effects of diet, country, and their possible interaction (diet:country) on the microbiome, lipidome, and metabolome compositions in an integrative manner. This part of the analysis is not available on the GitHub page, but the code will be provided upon request.

  • Linear models: Linear models were applied to estimate the effects of diet, country, and their interaction (diet:country) on individual lipids, metabolites, bacterial taxa and pathways (“features”). Features that significantly differed between diet groups (based on the estimated average effect of diet across both countries, adjusted for multiple comparisons with FDR < 0.05) were further examined in the independent validation cohort to assess whether these associations were reproducible.

  • Predictive models (elastic net): We employed elastic net (regularized) logistic regression to predict vegan status based on the metabolome, lipidome, microbiome and pathways data (one predictive model per dataset, i.e., four elastic net models in total). These models were internally validated using out-of-bag bootstrap resampling. The discriminatory power of each model to differentiate between diet groups was evaluated using the out-of-sample (optimism-corrected) area under the receiver operating characteristic curve (ROC-AUC). The models trained on the training data were then used to estimate the predicted probability that a given subject is vegan in an independent validation cohort. This predicted probability was subsequently used as a variable to discriminate between diet groups for external validation.
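
The per-feature linear-model step can be illustrated with a minimal base-R sketch on simulated data. Everything here is an assumption made for illustration: the group sizes, the feature names, the simulated effects, and the choice of sum-to-zero coding for Country (used so that the Diet coefficient can be read as the diet effect averaged over both countries). It is not the code used in the study.

```r
set.seed(478)

# Simulated design: 2 diets x 2 countries, 20 subjects per group (hypothetical)
design <- expand.grid(
  replicate = 1:20,
  Diet      = c("OMNI", "VEGAN"),
  Country   = c("CZ", "IT")
)

# Sum-to-zero coding for Country: the Diet coefficient then estimates the
# diet effect averaged across the two countries (illustrative choice)
contrasts(design$Country) <- contr.sum(2)

# 50 simulated features; only the first 5 carry a true diet effect
n_features <- 50
features <- sapply(seq_len(n_features), function(j) {
  effect <- if (j <= 5) 1.5 else 0
  rnorm(nrow(design)) + effect * (design$Diet == "VEGAN")
})
colnames(features) <- paste0("feature_", seq_len(n_features))

# One linear model per feature, including the diet:country interaction
p_diet <- apply(features, 2, function(y) {
  fit <- lm(y ~ Diet * Country, data = design)
  summary(fit)$coefficients["DietVEGAN", "Pr(>|t|)"]
})

# Benjamini-Hochberg adjustment; features with FDR < 0.05 would be carried
# forward to the validation cohort
q_diet     <- p.adjust(p_diet, method = "BH")
candidates <- names(q_diet)[q_diet < 0.05]
candidates
```

In this sketch, `candidates` plays the role of the feature list that would be re-examined in the independent validation cohort.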

2 Initiation

2.1 Set home directory

setwd('/home/ticf/GitRepo/ticf/478_MOCA_italian')

2.2 Upload initiation file

source('478_initiation.R')

3 Data

3.1 Upload all original data

3.1.1 Training set

3.1.1.1 Get metadata from lipidome table

training_metadata <- read.xlsx('gitignore/data/lipidome_training_cohort.xlsx') %>% 
  select(Sample, Country, Diet) %>% 
  mutate(ID = Sample)

3.1.1.2 Connect training data

data_microbiome_original_raw  <- read.table(
  'gitignore/data/0_Data_Metaphlan4_SGB_subset.txt') 

colnames(data_microbiome_original_raw) <- data_microbiome_original_raw[1,]

data_microbiome_original_raw <- data_microbiome_original_raw %>% 
  t() %>% 
  data.frame()

colnames(data_microbiome_original_raw) <- data_microbiome_original_raw[1,]

data_microbiome_original_raw <- data_microbiome_original_raw[-1, ] %>% 
  left_join(training_metadata, by = 'ID') %>% 
  mutate(Data = 'train', Sample = ID) %>% 
  select(Sample, Data, Diet, Country, everything()) %>% 
  select(-ID)

dim(data_microbiome_original_raw)
## [1] 166 684

3.1.2 Validation set

3.1.2.1 Get metadata from lipidome table

data_lipids_validation <- read.xlsx('gitignore/data/lipidome_validation_cohort.xlsx') %>% 
  mutate(ID = X1) %>% 
  select(ID, X2)

3.1.2.2 Connect validation data

data_microbiome_validation_raw <- read.table(
  'gitignore/data/0_Data_Metaphlan4_SGB_subset_validation.txt') 

colnames(data_microbiome_validation_raw) <- data_microbiome_validation_raw[1,]

data_microbiome_validation_raw <- data_microbiome_validation_raw %>% 
  t() %>% 
  data.frame()

colnames(data_microbiome_validation_raw) <- data_microbiome_validation_raw[1,]

data_microbiome_validation_raw <- data_microbiome_validation_raw[-1, ] %>% 
  mutate(
    ID= paste0("K", gsub("\\..*", "", trimws(ID)))) %>% 
  left_join(data_lipids_validation, by = 'ID') %>% 
  mutate(Data = 'valid', Sample = ID, Diet = X2) %>% 
  select(Sample, Data, Diet, everything()) %>% 
  select(-ID, -X2)

dim(data_microbiome_validation_raw)
## [1] 103 683
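
The ID harmonisation used before the join above can be checked in isolation. The sketch below uses made-up raw labels and a made-up metadata table (both hypothetical) to show what the `trimws()` / `gsub()` / `paste0()` chain produces, and how samples without a metadata match can be listed before joining.

```r
# Hypothetical raw sample labels as they might appear in the abundance table
raw_ids <- c(" 284.1", "301.2 ", "17")

# Same transformation as in the report: trim whitespace, drop everything
# from the first dot onwards, and prefix with "K"
harmonise_id <- function(id) paste0("K", gsub("\\..*", "", trimws(id)))
harmonised   <- harmonise_id(raw_ids)
harmonised

# Hypothetical metadata; any ID missing here would get Diet = NA after a left join
toy_metadata <- data.frame(ID = c("K301", "K17"), X2 = c("VEGAN", "OMNI"))
unmatched    <- setdiff(harmonised, toy_metadata$ID)
unmatched
```

Listing `unmatched` before the join makes it easier to spot samples whose diet label will have to be filled in by hand.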

3.1.3 Get centered log-ratio (CLR) transformed values

set.seed(478)

## Training data
metadata <- data_microbiome_original_raw[, c("Sample", "Country", "Diet")]

bacteria_d <- data_microbiome_original_raw[, -c(1:4)] %>% 
  mutate(across(everything(), as.numeric)) %>% 
  select(where(~ mean(. != 0) >= 0.3)) 

rel_taxons <- c(colnames(bacteria_d))

bacteria_data <- bacteria_d / rowSums(bacteria_d)
dim(bacteria_data)
## [1] 166 299
    
## Impute zeros (label = 0) with lrSVD before the CLR transform
bacteria_data <- lrSVD(bacteria_data, 
                       label = 0, 
                       dl = NULL,
                       z.warning = 0.9,
                       z.delete = FALSE,
                       ncp = 1)

clr_bacteria_data <- clr(bacteria_data)
data_microbiome_original <- cbind(metadata, clr_bacteria_data)

if (!file.exists('gitignore/data_microbiome_SGB30_training_impCLR.csv')) {
  write.csv(data_microbiome_original ,
            'gitignore/data_microbiome_SGB30_training_impCLR.csv')
}

## Show the per-sample variance of CLR values across taxa
data_variance <- data_microbiome_original %>%
  rowwise() %>%   
  mutate(variance = var(c_across(-(Sample:Diet)))) %>%  
  ungroup() %>%       
  select(Sample, variance)  

## Look at distribution
hist(data_variance$variance)

## Show extreme samples

## Validation data
metadata <- data_microbiome_validation_raw[, c("Sample", "Data", "Diet")]
bacteria_d <- data_microbiome_validation_raw[, -(1:3)] %>% 
  mutate(across(everything(), as.numeric)) %>% 
  select(all_of(rel_taxons))

bacteria_data <- bacteria_d / rowSums(bacteria_d)
min(colSums(bacteria_data))
## [1] 0.0004796078
    
## Impute zeros (label = 0) with lrSVD before the CLR transform
bacteria_data <- lrSVD(bacteria_data, 
                       label = 0, 
                       dl = NULL,
                       z.warning = 0.9, 
                       z.delete = FALSE,
                       ncp = 1)



clr_bacteria_data <- clr(bacteria_data)
data_microbiome_validation <- cbind(metadata, clr_bacteria_data)

## Add Diet for K284 which has the diet missing
data_microbiome_validation[
  which(data_microbiome_validation$Sample == 'K284'), 'Diet'
  ] <- 'VEGAN'

data_microbiome_validation$`Wujia_chipingensis|SGB5111`
##   [1] -2.2659270 -1.8366784 -1.5719579 -1.5649595 -1.3519655 -1.9273808
##   [7] -0.5872619 -1.9902827 -0.9982257 -1.6966210 -2.1930395 -0.1319288
##  [13] -2.1260143 -1.9245373 -2.0088636  0.5526297  3.9926563 -0.7907929
##  [19] -2.1891711 -1.9395276 -2.1353811 -2.1654279 -2.2360908 -0.3930113
##  [25] -2.3957911 -0.2977329 -1.4533578 -1.4072936 -2.0794670  1.8049062
##  [31]  1.5258843  3.6617309  3.8680214 -2.0412347  0.4626366  0.3866172
##  [37]  5.8044209 -2.2201075 -0.4444399 -2.1377634 -2.1399921 -2.0933531
##  [43]  4.5931692 -1.6871195  2.8164681 -2.5644101 -2.0585439 -2.6629664
##  [49]  0.6723405  4.4534434  4.1610574  5.3832639 -1.6933290 -2.5382040
##  [55] -2.4107280 -2.1198776  5.3413842 -1.8967301 -1.5487724 -2.1396917
##  [61] -1.7628668  4.1335815  5.0872100  3.6031291 -1.8040730  4.9799740
##  [67] -0.9850500 -1.5123666 -0.5770878 -1.7964678 -2.1354643  2.3798401
##  [73]  5.9776544 -2.2009657 -2.4395272  4.0615507 -1.3941153 -2.3882306
##  [79] -2.2799115  4.6054080 -1.5516203 -0.4367249 -1.5615449  4.0790976
##  [85]  1.0917225  3.7139508 -1.8139512 -1.1749955 -1.6831395 -1.5614355
##  [91] -1.9565894 -0.1425388 -1.0454574 -2.1128182  4.9182020 -1.6631032
##  [97] -2.3185501 -1.6734295 -1.6233100 -1.5638780  3.3149668  4.7828809
## [103] -1.5505336

# data_microbiome_validation <- data_microbiome_validation %>% 
#   rename("Wujia_chipingenss|SGB5111" = "Wujia_chipingensis|SGB5111")


if (!file.exists('gitignore/data_microbiome_SGB30_validation_impCLR.csv')) {
  write.csv(data_microbiome_validation,
            'gitignore/data_microbiome_SGB30_validation_impCLR.csv')
}
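
The preprocessing chain in this section (prevalence filter, closure to proportions, zero replacement, CLR) can be followed on a toy table with base R only. This is a sketch under stated assumptions: the abundance values are invented, and the zero replacement uses a simple pseudo-value (half of the smallest non-zero proportion) as a stand-in for the `lrSVD()` imputation used in the report, so the numbers are not comparable to the real output.

```r
# Toy abundance table: 6 samples x 5 taxa (hypothetical values)
counts <- matrix(
  c(10, 0, 5, 1, 4,
     8, 2, 0, 3, 0,
    12, 1, 7, 0, 0,
     9, 0, 6, 2, 0,
    11, 3, 0, 4, 0,
     7, 0, 8, 1, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("S", 1:6), paste0("taxon_", 1:5))
)

# 1) Prevalence filter: keep taxa that are non-zero in at least 30% of samples
prevalence <- colMeans(counts != 0)
kept       <- counts[, prevalence >= 0.3, drop = FALSE]

# 2) Closure: rescale each sample (row) to proportions summing to 1
props <- kept / rowSums(kept)

# 3) Zero replacement (stand-in for lrSVD, NOT the method used in the report)
pseudo <- min(props[props > 0]) / 2
props[props == 0] <- pseudo
props <- props / rowSums(props)

# 4) CLR: log proportion minus the per-sample mean of log proportions
clr_manual <- log(props) - rowMeans(log(props))
round(clr_manual, 2)
```

Each row of `clr_manual` sums to zero by construction. For a validation set, the taxa selected in step 1 on the training data would be reused rather than re-derived, mirroring the use of `rel_taxons` above.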

3.1.4 Merge training and validation dataset

cbind(
  colnames(data_microbiome_original), 
  colnames(data_microbiome_validation))
##        [,1]                                                
##   [1,] "Sample"                                            
##   [2,] "Country"                                           
##   [3,] "Diet"                                              
##   [4,] "Bacteroides_stercoris|SGB1830"                     
##   [5,] "Alistipes_putredinis|SGB2318"                      
##   [6,] "Candidatus_Cibionibacter_quicibialis|SGB15286"     
##   [7,] "Bacteroides_uniformis|SGB1836"                     
##   [8,] "Eubacterium_siraeum|SGB4198"                       
##   [9,] "GGB9602_SGB15031|SGB15031"                         
##  [10,] "Phocaeicola_vulgatus|SGB1814"                      
##  [11,] "Faecalibacterium_prausnitzii|SGB15316"             
##  [12,] "Lachnospiraceae_bacterium_CLA_AA_H244|SGB4993"     
##  [13,] "Sutterella_wadsworthensis|SGB9283"                 
##  [14,] "Oscillibacter_sp_ER4|SGB15254"                     
##  [15,] "Parabacteroides_distasonis|SGB1934"                
##  [16,] "Parabacteroides_merdae|SGB1949"                    
##  [17,] "Oscillibacter_valericigenes|SGB15053"              
##  [18,] "GGB47687_SGB2286|SGB2286"                          
##  [19,] "Brotolimicola_acetigignens|SGB4914"                
##  [20,] "GGB9747_SGB15356|SGB15356"                         
##  [21,] "Faecalibacterium_prausnitzii|SGB15339"             
##  [22,] "Alistipes_shahii|SGB2295"                          
##  [23,] "Alistipes_onderdonkii|SGB2303"                     
##  [24,] "Faecalibacterium_SGB15346|SGB15346"                
##  [25,] "Dialister_invisus|SGB5825_group"                   
##  [26,] "Faecalibacterium_prausnitzii|SGB15332"             
##  [27,] "Gemmiger_formicilis|SGB15300"                      
##  [28,] "GGB33469_SGB15236|SGB15236"                        
##  [29,] "Bacteroides_eggerthii|SGB1829"                     
##  [30,] "Coprococcus_eutactus|SGB5117"                      
##  [31,] "Alistipes_communis|SGB2290"                        
##  [32,] "Faecalibacterium_prausnitzii|SGB15322"             
##  [33,] "Escherichia_coli|SGB10068"                         
##  [34,] "Faecalibacterium_prausnitzii|SGB15318"             
##  [35,] "GGB3175_SGB4191|SGB4191"                           
##  [36,] "Faecalibacterium_prausnitzii|SGB15342"             
##  [37,] "Fusicatenibacter_saccharivorans|SGB4874"           
##  [38,] "Intestinimonas_massiliensis|SGB15127"              
##  [39,] "Vescimonas_coprocola|SGB15089"                     
##  [40,] "Oscillibacter_sp_MSJ_31|SGB15249"                  
##  [41,] "Bacteroides_caccae|SGB1877"                        
##  [42,] "Oscillospiraceae_bacterium_CLA_AA_H250|SGB14861"   
##  [43,] "Bifidobacterium_longum|SGB17248"                   
##  [44,] "Alistipes_senegalensis|SGB2296"                    
##  [45,] "Roseburia_intestinalis|SGB4951"                    
##  [46,] "Dysosmobacter_welbionis|SGB15078"                  
##  [47,] "Eubacterium_rectale|SGB4933"                       
##  [48,] "Alistipes_ihumii|SGB2328"                          
##  [49,] "Roseburia_faecis|SGB4925"                          
##  [50,] "GGB9699_SGB15216|SGB15216"                         
##  [51,] "Oscillospiraceae_bacterium|SGB15225"               
##  [52,] "Faecalibacterium_prausnitzii|SGB15317"             
##  [53,] "GGB9715_SGB15267|SGB15267"                         
##  [54,] "GGB9667_SGB15164|SGB15164"                         
##  [55,] "Lachnospira_pectinoschiza|SGB5075"                 
##  [56,] "Bacteroides_clarus|SGB1832"                        
##  [57,] "Anaerotignum_faecicola|SGB5190"                    
##  [58,] "Hydrogenoanaerobacterium_saccharovorans|SGB15350"  
##  [59,] "Bacteroides_faecis|SGB1860"                        
##  [60,] "Roseburia_hominis|SGB4936"                         
##  [61,] "GGB9712_SGB15244|SGB15244"                         
##  [62,] "Ruminococcus_bicirculans|SGB4262"                  
##  [63,] "Bilophila_wadsworthia|SGB15452"                    
##  [64,] "Odoribacter_splanchnicus|SGB1790"                  
##  [65,] "GGB9365_SGB14341|SGB14341"                         
##  [66,] "GGB9715_SGB15265|SGB15265"                         
##  [67,] "GGB9730_SGB15291|SGB15291"                         
##  [68,] "Bifidobacterium_adolescentis|SGB17244"             
##  [69,] "Bacteroides_ovatus|SGB1871"                        
##  [70,] "GGB9614_SGB15049|SGB15049"                         
##  [71,] "Simiaoa_sunii|SGB4910"                             
##  [72,] "GGB9759_SGB15370|SGB15370"                         
##  [73,] "Barnesiella_intestinihominis|SGB1965"              
##  [74,] "Agathobaculum_butyriciproducens|SGB14993"          
##  [75,] "Lachnospira_eligens|SGB5082"                       
##  [76,] "Lacrimispora_amygdalina|SGB4716"                   
##  [77,] "Akkermansia_muciniphila|SGB9226"                   
##  [78,] "Clostridiales_bacterium|SGB15143"                  
##  [79,] "GGB6649_SGB9391|SGB9391"                           
##  [80,] "Ruminococcus_bromii|SGB4285"                       
##  [81,] "Gemmiger_formicilis|SGB15299"                      
##  [82,] "Alistipes_sp_AF17_16|SGB2326"                      
##  [83,] "Clostridiaceae_bacterium|SGB4269"                  
##  [84,] "GGB9760_SGB15373|SGB15373"                         
##  [85,] "Clostridiaceae_bacterium_AF18_31LB|SGB4767"        
##  [86,] "GGB9621_SGB15073|SGB15073"                         
##  [87,] "Bacteroides_cellulosilyticus|SGB1844"              
##  [88,] "Faecalibacterium_sp_CLA_AA_H233|SGB15315"          
##  [89,] "Ruthenibacterium_lactatiformans|SGB15271"          
##  [90,] "Lawsonibacter_asaccharolyticus|SGB15154"           
##  [91,] "Clostridium_sp_AF20_17LB|SGB4714"                  
##  [92,] "Phocaeicola_massiliensis|SGB1812"                  
##  [93,] "GGB9760_SGB15374|SGB15374"                         
##  [94,] "Clostridium_fessum|SGB4705"                        
##  [95,] "GGB33586_SGB53517|SGB53517"                        
##  [96,] "Clostridium_sp_AF36_4|SGB4644"                     
##  [97,] "Clostridium_SGB4909|SGB4909"                       
##  [98,] "GGB13404_SGB14252|SGB14252"                        
##  [99,] "GGB9627_SGB15081|SGB15081"                         
## [100,] "GGB9608_SGB15041|SGB15041"                         
## [101,] "GGB9707_SGB15229|SGB15229"                         
## [102,] "Parasutterella_excrementihominis|SGB9262"          
## [103,] "Roseburia_inulinivorans|SGB4940"                   
## [104,] "Phocaeicola_dorei|SGB1815"                         
## [105,] "Oscillibacter_valericigenes|SGB15124"              
## [106,] "GGB9502_SGB14899|SGB14899"                         
## [107,] "GGB9818_SGB15459|SGB15459"                         
## [108,] "GGB9767_SGB15385|SGB15385"                         
## [109,] "Clostridiaceae_bacterium|SGB4770"                  
## [110,] "Bacteroides_thetaiotaomicron|SGB1861"              
## [111,] "GGB9559_SGB14969|SGB14969"                         
## [112,] "Flavonifractor_plautii|SGB15132"                   
## [113,] "Fusicatenibacter_sp_CLA_AA_H277|SGB4780"           
## [114,] "Lachnospiraceae_bacterium_AM48_27BH|SGB4706"       
## [115,] "GGB9063_SGB13982|SGB13982"                         
## [116,] "Collinsella_aerofaciens|SGB14535"                  
## [117,] "GGB9522_SGB14921|SGB14921"                         
## [118,] "GGB9619_SGB15067|SGB15067"                         
## [119,] "Clostridium_leptum|SGB14853"                       
## [120,] "GGB52130_SGB14966|SGB14966"                        
## [121,] "Clostridium_sp_AF27_2AA|SGB4712"                   
## [122,] "Blautia_sp_MCC283|SGB4828"                         
## [123,] "Clostridiales_bacterium_KLE1615|SGB5090"           
## [124,] "Lachnospira_sp_NSJ_43|SGB5087"                     
## [125,] "Butyricimonas_virosa|SGB1784"                      
## [126,] "GGB9509_SGB14906|SGB14906"                         
## [127,] "Coprococcus_catus|SGB4670"                         
## [128,] "GGB4585_SGB6340|SGB6340"                           
## [129,] "Phascolarctobacterium_faecium|SGB5792"             
## [130,] "Clostridiaceae_bacterium_Marseille_Q4143|SGB4768"  
## [131,] "GGB45432_SGB63101|SGB63101"                        
## [132,] "Clostridium_sp_AM33_3|SGB4711"                     
## [133,] "GGB9342_SGB14306|SGB14306"                         
## [134,] "Faecalicatena_fissicatena|SGB4871"                 
## [135,] "Alistipes_finegoldii|SGB2301"                      
## [136,] "Oscillospiraceae_bacterium_Marseille_Q3528|SGB4778"
## [137,] "Faecalibacterium_sp_HTFF|SGB15340"                 
## [138,] "GGB3653_SGB4964|SGB4964"                           
## [139,] "Intestinimonas_butyriciproducens|SGB15126"         
## [140,] "GGB9719_SGB15272|SGB15272"                         
## [141,] "Blautia_glucerasea|SGB4816"                        
## [142,] "GGB2653_SGB3574|SGB3574"                           
## [143,] "Veillonella_dispar|SGB6952"                        
## [144,] "bacterium_210917_DFI_7_65|SGB14999"                
## [145,] "GGB9646_SGB15123|SGB15123"                         
## [146,] "GGB51441_SGB71759|SGB71759"                        
## [147,] "GGB33512_SGB15203|SGB15203"                        
## [148,] "Alistipes_indistinctus|SGB2325"                    
## [149,] "Candidatus_Borkfalkia_ceftriaxoniphila|SGB14027"   
## [150,] "GGB9062_SGB13981|SGB13981"                         
## [151,] "Lachnospiraceae_bacterium|SGB4781"                 
## [152,] "Phocea_massiliensis|SGB14837"                      
## [153,] "Guopingia_tenuis|SGB14127"                         
## [154,] "GGB3109_SGB4121|SGB4121"                           
## [155,] "GGB9534_SGB14937|SGB14937"                         
## [156,] "Anaerotruncus_rubiinfantis|SGB25416"               
## [157,] "Lachnospiraceae_bacterium|SGB4953"                 
## [158,] "Oscillibacter_sp_PC13|SGB7258"                     
## [159,] "Blautia_SGB4815|SGB4815"                           
## [160,] "Lawsonibacter_hominis|SGB15131"                    
## [161,] "Clostridium_SGB4750|SGB4750"                       
## [162,] "Holdemania_filiformis|SGB4046"                     
## [163,] "Anaeromassilibacillus_senegalensis|SGB14894"       
## [164,] "Lachnospira_pectinoschiza|SGB5089"                 
## [165,] "Anaerofilum_hominis|SGB79822"                      
## [166,] "Lachnotalea_sp_AF33_28|SGB5200"                    
## [167,] "Senegalimassilia_anaerobia|SGB14824_group"         
## [168,] "GGB9345_SGB14311|SGB14311"                         
## [169,] "Blautia_obeum|SGB4811"                             
## [170,] "Ligaoa_zhengdingensis|SGB14839"                    
## [171,] "Adlercreutzia_equolifaciens|SGB14797"              
## [172,] "Blautia_wexlerae|SGB4837"                          
## [173,] "GGB9787_SGB15410|SGB15410"                         
## [174,] "GGB36331_SGB15121|SGB15121"                        
## [175,] "Clostridium_sp_AM22_11AC|SGB4749"                  
## [176,] "Ruminococcus_sp_AF41_9|SGB25497"                   
## [177,] "Anaerotruncus_colihominis|SGB14963"                
## [178,] "Eubacterium_ramulus|SGB4959"                       
## [179,] "Enterocloster_lavalensis|SGB4725"                  
## [180,] "GGB2848_SGB3813|SGB3813"                           
## [181,] "Blautia_faecis|SGB4820"                            
## [182,] "Clostridiaceae_bacterium_Marseille_Q4145|SGB4769"  
## [183,] "Roseburia_sp_AF02_12|SGB4938"                      
## [184,] "Butyricimonas_faecihominis|SGB1786"                
## [185,] "Youxingia_wuxianensis|SGB82503"                    
## [186,] "Intestinimonas_gabonensis|SGB79840"                
## [187,] "GGB9064_SGB13983|SGB13983"                         
## [188,] "GGB45613_SGB63326|SGB63326"                        
## [189,] "GGB4552_SGB6276|SGB6276"                           
## [190,] "Blautia_massiliensis|SGB4826"                      
## [191,] "Lachnospiraceae_bacterium_CLA_AA_H215|SGB4777"     
## [192,] "Dorea_formicigenerans|SGB4575"                     
## [193,] "Clostridia_bacterium_UC5_1_1D1|SGB14995"           
## [194,] "GGB9765_SGB15382|SGB15382"                         
## [195,] "GGB9770_SGB15390|SGB15390"                         
## [196,] "Clostridium_sp_AM49_4BH|SGB4652"                   
## [197,] "GGB3537_SGB4727|SGB4727"                           
## [198,] "Wansuia_hejianensis|SGB25431"                      
## [199,] "Coprococcus_eutactus|SGB5121"                      
## [200,] "Coprobacter_secundus|SGB1962"                      
## [201,] "GGB3510_SGB4687|SGB4687"                           
## [202,] "Mediterraneibacter_faecis|SGB4563"                 
## [203,] "Ruminococcus_torques|SGB4608"                      
## [204,] "Coprococcus_comes|SGB4577"                         
## [205,] "Streptococcus_parasanguinis|SGB8071"               
## [206,] "Provencibacterium_massiliense|SGB14838"            
## [207,] "Eubacterium_ventriosum|SGB5045"                    
## [208,] "Enterocloster_citroniae|SGB4761"                   
## [209,] "GGB51647_SGB4348|SGB4348"                          
## [210,] "GGB9635_SGB15106|SGB15106"                         
## [211,] "GGB9758_SGB15368|SGB15368"                         
## [212,] "GGB9708_SGB15234|SGB15234"                         
## [213,] "GGB3304_SGB4367|SGB4367"                           
## [214,] "Wujia_chipingensis|SGB5111"                        
## [215,] "Clostridiaceae_bacterium_Marseille_Q4149|SGB15091" 
## [216,] "Paraprevotella_clara|SGB1798"                      
## [217,] "Agathobaculum_butyriciproducens|SGB14991"          
## [218,] "GGB9296_SGB14253|SGB14253"                         
## [219,] "Butyricimonas_paravirosa|SGB1785"                  
## [220,] "Dorea_longicatena|SGB4581"                         
## [221,] "Ruminococcus_lactaris|SGB4557"                     
## [222,] "Alistipes_dispar|SGB2311"                          
## [223,] "Dysosmobacter_SGB15077|SGB15077"                   
## [224,] "GGB13489_SGB15224|SGB15224"                        
## [225,] "Clostridium_sp_AF12_28|SGB4715"                    
## [226,] "GGB34797_SGB14322|SGB14322"                        
## [227,] "Anaerobutyricum_hallii|SGB4532"                    
## [228,] "Clostridium_SGB48024|SGB48024"                     
## [229,] "Slackia_isoflavoniconvertens|SGB14773"             
## [230,] "GGB3321_SGB4394|SGB4394"                           
## [231,] "GGB9237_SGB14179|SGB14179"                         
## [232,] "Anaerotruncus_massiliensis|SGB14965"               
## [233,] "GGB9531_SGB14932|SGB14932"                         
## [234,] "GGB3619_SGB4894|SGB4894"                           
## [235,] "Streptococcus_salivarius|SGB8007_group"            
## [236,] "Oscillibacter_valericigenes|SGB15076"              
## [237,] "Anaerostipes_hadrus|SGB4540"                       
## [238,] "Blautia_wexlerae|SGB4831"                          
## [239,] "GGB2970_SGB3952|SGB3952"                           
## [240,] "Lawsonibacter_SGB15145|SGB15145"                   
## [241,] "GGB9563_SGB14975|SGB14975"                         
## [242,] "Blautia_glucerasea|SGB4804"                        
## [243,] "Anaerostipes_hadrus|SGB4547"                       
## [244,] "Lachnospiraceae_bacterium|SGB4782"                 
## [245,] "GGB2998_SGB3988|SGB3988"                           
## [246,] "GGB9616_SGB15052|SGB15052"                         
## [247,] "GGB9524_SGB14924|SGB14924"                         
## [248,] "Segatella_copri|SGB1626"                           
## [249,] "Mediterraneibacter_butyricigenes|SGB25493"         
## [250,] "Haemophilus_parainfluenzae|SGB9712"                
## [251,] "Anaerosacchariphilus_sp_NSJ_68|SGB4772"            
## [252,] "Anaerotignum_sp_MSJ_24|SGB5180"                    
## [253,] "Dorea_sp_AF36_15AT|SGB4552"                        
## [254,] "GGB58158_SGB79798|SGB79798"                        
## [255,] "Pseudoflavonifractor_capillosus|SGB15140"          
## [256,] "Oliverpabstia_intestinalis|SGB4868"                
## [257,] "Veillonella_parvula|SGB6939"                       
## [258,] "Colidextribacter_sp_210702_DFI_3_9|SGB15146"       
## [259,] "Coprobacter_fastidiosus|SGB1963"                   
## [260,] "Veillonella_rogosae|SGB6956"                       
## [261,] "Eubacteriaceae_bacterium|SGB3958"                  
## [262,] "TM7_phylum_sp_oral_taxon_352|SGB19860_group"       
## [263,] "Enterocloster_hominis|SGB4721"                     
## [264,] "Rothia_mucilaginosa|SGB16971_group"                
## [265,] "Blautia_luti|SGB4832"                              
## [266,] "Blautia_sp_OF03_15BH|SGB4779"                      
## [267,] "Streptococcus_australis|SGB8059_group"             
## [268,] "Ruminococcus_gnavus|SGB4571"                       
## [269,] "Streptococcus_thermophilus|SGB8002"                
## [270,] "Roseburia_hominis|SGB4659"                         
## [271,] "Lentihominibacter_faecis|SGB3957"                  
## [272,] "Actinomyces_sp_ICM47|SGB17167_group"               
## [273,] "Evtepia_gabavorous|SGB15120"                       
## [274,] "Blautia_obeum|SGB4810"                             
## [275,] "GGB3480_SGB4648|SGB4648"                           
## [276,] "Ellagibacter_isourolithinifaciens|SGB14816_group"  
## [277,] "GGB2982_SGB3964|SGB3964"                           
## [278,] "Anaerobutyricum_soehngenii|SGB4537"                
## [279,] "Bacteroides_xylanisolvens|SGB1867"                 
## [280,] "Neobittarella_massiliensis|SGB7264"                
## [281,] "Faecalibacillus_intestinalis|SGB6754"              
## [282,] "GGB9708_SGB15233|SGB15233"                         
## [283,] "Enterocloster_asparagiformis|SGB4724"              
## [284,] "GGB51884_SGB49168|SGB49168"                        
## [285,] "Enterocloster_aldenensis|SGB4762"                  
## [286,] "Eggerthella_lenta|SGB14809"                        
## [287,] "Bacteroides_fragilis|SGB1855"                      
## [288,] "Veillonella_atypica|SGB6936"                       
## [289,] "Enterocloster_bolteae|SGB4758"                     
## [290,] "Romboutsia_timonensis|SGB6148"                     
## [291,] "Roseburia_lenta|SGB4957"                           
## [292,] "GGB3288_SGB4342|SGB4342"                           
## [293,] "GGB9574_SGB14987|SGB14987"                         
## [294,] "Dorea_longicatena|SGB4582"                         
## [295,] "Eubacterium_sp_AF34_35BH|SGB5051"                  
## [296,] "GGB9640_SGB15115|SGB15115"                         
## [297,] "Clostridiaceae_unclassified_SGB4771|SGB4771"       
## [298,] "GGB9766_SGB15383|SGB15383"                         
## [299,] "Lachnospiraceae_bacterium_OM04_12BH|SGB4893"       
## [300,] "Clostridium_SGB6173|SGB6173"                       
## [301,] "GGB9616_SGB15051|SGB15051"                         
## [302,] "Intestinibacter_bartlettii|SGB6140"                
##        [,2]                                                
##   [1,] "Sample"                                            
##   [2,] "Data"                                              
##   [3,] "Diet"                                              
##   [4,] "Bacteroides_stercoris|SGB1830"                     
##   [5,] "Alistipes_putredinis|SGB2318"                      
##   [6,] "Candidatus_Cibionibacter_quicibialis|SGB15286"     
##   [7,] "Bacteroides_uniformis|SGB1836"                     
##   [8,] "Eubacterium_siraeum|SGB4198"                       
##   [9,] "GGB9602_SGB15031|SGB15031"                         
##  [10,] "Phocaeicola_vulgatus|SGB1814"                      
##  [11,] "Faecalibacterium_prausnitzii|SGB15316"             
##  [12,] "Lachnospiraceae_bacterium_CLA_AA_H244|SGB4993"     
##  [13,] "Sutterella_wadsworthensis|SGB9283"                 
##  [14,] "Oscillibacter_sp_ER4|SGB15254"                     
##  [15,] "Parabacteroides_distasonis|SGB1934"                
##  [16,] "Parabacteroides_merdae|SGB1949"                    
##  [17,] "Oscillibacter_valericigenes|SGB15053"              
##  [18,] "GGB47687_SGB2286|SGB2286"                          
##  [19,] "Brotolimicola_acetigignens|SGB4914"                
##  [20,] "GGB9747_SGB15356|SGB15356"                         
##  [21,] "Faecalibacterium_prausnitzii|SGB15339"             
##  [22,] "Alistipes_shahii|SGB2295"                          
##  [23,] "Alistipes_onderdonkii|SGB2303"                     
##  [24,] "Faecalibacterium_SGB15346|SGB15346"                
##  [25,] "Dialister_invisus|SGB5825_group"                   
##  [26,] "Faecalibacterium_prausnitzii|SGB15332"             
##  [27,] "Gemmiger_formicilis|SGB15300"                      
##  [28,] "GGB33469_SGB15236|SGB15236"                        
##  [29,] "Bacteroides_eggerthii|SGB1829"                     
##  [30,] "Coprococcus_eutactus|SGB5117"                      
##  [31,] "Alistipes_communis|SGB2290"                        
##  [32,] "Faecalibacterium_prausnitzii|SGB15322"             
##  [33,] "Escherichia_coli|SGB10068"                         
##  [34,] "Faecalibacterium_prausnitzii|SGB15318"             
##  [35,] "GGB3175_SGB4191|SGB4191"                           
##  [36,] "Faecalibacterium_prausnitzii|SGB15342"             
##  [37,] "Fusicatenibacter_saccharivorans|SGB4874"           
##  [38,] "Intestinimonas_massiliensis|SGB15127"              
##  [39,] "Vescimonas_coprocola|SGB15089"                     
##  [40,] "Oscillibacter_sp_MSJ_31|SGB15249"                  
##  [41,] "Bacteroides_caccae|SGB1877"                        
##  [42,] "Oscillospiraceae_bacterium_CLA_AA_H250|SGB14861"   
##  [43,] "Bifidobacterium_longum|SGB17248"                   
##  [44,] "Alistipes_senegalensis|SGB2296"                    
##  [45,] "Roseburia_intestinalis|SGB4951"                    
##  [46,] "Dysosmobacter_welbionis|SGB15078"                  
##  [47,] "Eubacterium_rectale|SGB4933"                       
##  [48,] "Alistipes_ihumii|SGB2328"                          
##  [49,] "Roseburia_faecis|SGB4925"                          
##  [50,] "GGB9699_SGB15216|SGB15216"                         
##  [51,] "Oscillospiraceae_bacterium|SGB15225"               
##  [52,] "Faecalibacterium_prausnitzii|SGB15317"             
##  [53,] "GGB9715_SGB15267|SGB15267"                         
##  [54,] "GGB9667_SGB15164|SGB15164"                         
##  [55,] "Lachnospira_pectinoschiza|SGB5075"                 
##  [56,] "Bacteroides_clarus|SGB1832"                        
##  [57,] "Anaerotignum_faecicola|SGB5190"                    
##  [58,] "Hydrogenoanaerobacterium_saccharovorans|SGB15350"  
##  [59,] "Bacteroides_faecis|SGB1860"                        
##  [60,] "Roseburia_hominis|SGB4936"                         
##  [61,] "GGB9712_SGB15244|SGB15244"                         
##  [62,] "Ruminococcus_bicirculans|SGB4262"                  
##  [63,] "Bilophila_wadsworthia|SGB15452"                    
##  [64,] "Odoribacter_splanchnicus|SGB1790"                  
##  [65,] "GGB9365_SGB14341|SGB14341"                         
##  [66,] "GGB9715_SGB15265|SGB15265"                         
##  [67,] "GGB9730_SGB15291|SGB15291"                         
##  [68,] "Bifidobacterium_adolescentis|SGB17244"             
##  [69,] "Bacteroides_ovatus|SGB1871"                        
##  [70,] "GGB9614_SGB15049|SGB15049"                         
##  [71,] "Simiaoa_sunii|SGB4910"                             
##  [72,] "GGB9759_SGB15370|SGB15370"                         
##  [73,] "Barnesiella_intestinihominis|SGB1965"              
##  [74,] "Agathobaculum_butyriciproducens|SGB14993"          
##  [75,] "Lachnospira_eligens|SGB5082"                       
##  [76,] "Lacrimispora_amygdalina|SGB4716"                   
##  [77,] "Akkermansia_muciniphila|SGB9226"                   
##  [78,] "Clostridiales_bacterium|SGB15143"                  
##  [79,] "GGB6649_SGB9391|SGB9391"                           
##  [80,] "Ruminococcus_bromii|SGB4285"                       
##  [81,] "Gemmiger_formicilis|SGB15299"                      
##  [82,] "Alistipes_sp_AF17_16|SGB2326"                      
##  [83,] "Clostridiaceae_bacterium|SGB4269"                  
##  [84,] "GGB9760_SGB15373|SGB15373"                         
##  [85,] "Clostridiaceae_bacterium_AF18_31LB|SGB4767"        
##  [86,] "GGB9621_SGB15073|SGB15073"                         
##  [87,] "Bacteroides_cellulosilyticus|SGB1844"              
##  [88,] "Faecalibacterium_sp_CLA_AA_H233|SGB15315"          
##  [89,] "Ruthenibacterium_lactatiformans|SGB15271"          
##  [90,] "Lawsonibacter_asaccharolyticus|SGB15154"           
##  [91,] "Clostridium_sp_AF20_17LB|SGB4714"                  
##  [92,] "Phocaeicola_massiliensis|SGB1812"                  
##  [93,] "GGB9760_SGB15374|SGB15374"                         
##  [94,] "Clostridium_fessum|SGB4705"                        
##  [95,] "GGB33586_SGB53517|SGB53517"                        
##  [96,] "Clostridium_sp_AF36_4|SGB4644"                     
##  [97,] "Clostridium_SGB4909|SGB4909"                       
##  [98,] "GGB13404_SGB14252|SGB14252"                        
##  [99,] "GGB9627_SGB15081|SGB15081"                         
## [100,] "GGB9608_SGB15041|SGB15041"                         
## [101,] "GGB9707_SGB15229|SGB15229"                         
## [102,] "Parasutterella_excrementihominis|SGB9262"          
## [103,] "Roseburia_inulinivorans|SGB4940"                   
## [104,] "Phocaeicola_dorei|SGB1815"                         
## [105,] "Oscillibacter_valericigenes|SGB15124"              
## [106,] "GGB9502_SGB14899|SGB14899"                         
## [107,] "GGB9818_SGB15459|SGB15459"                         
## [108,] "GGB9767_SGB15385|SGB15385"                         
## [109,] "Clostridiaceae_bacterium|SGB4770"                  
## [110,] "Bacteroides_thetaiotaomicron|SGB1861"              
## [111,] "GGB9559_SGB14969|SGB14969"                         
## [112,] "Flavonifractor_plautii|SGB15132"                   
## [113,] "Fusicatenibacter_sp_CLA_AA_H277|SGB4780"           
## [114,] "Lachnospiraceae_bacterium_AM48_27BH|SGB4706"       
## [115,] "GGB9063_SGB13982|SGB13982"                         
## [116,] "Collinsella_aerofaciens|SGB14535"                  
## [117,] "GGB9522_SGB14921|SGB14921"                         
## [118,] "GGB9619_SGB15067|SGB15067"                         
## [119,] "Clostridium_leptum|SGB14853"                       
## [120,] "GGB52130_SGB14966|SGB14966"                        
## [121,] "Clostridium_sp_AF27_2AA|SGB4712"                   
## [122,] "Blautia_sp_MCC283|SGB4828"                         
## [123,] "Clostridiales_bacterium_KLE1615|SGB5090"           
## [124,] "Lachnospira_sp_NSJ_43|SGB5087"                     
## [125,] "Butyricimonas_virosa|SGB1784"                      
## [126,] "GGB9509_SGB14906|SGB14906"                         
## [127,] "Coprococcus_catus|SGB4670"                         
## [128,] "GGB4585_SGB6340|SGB6340"                           
## [129,] "Phascolarctobacterium_faecium|SGB5792"             
## [130,] "Clostridiaceae_bacterium_Marseille_Q4143|SGB4768"  
## [131,] "GGB45432_SGB63101|SGB63101"                        
## [132,] "Clostridium_sp_AM33_3|SGB4711"                     
## [133,] "GGB9342_SGB14306|SGB14306"                         
## [134,] "Faecalicatena_fissicatena|SGB4871"                 
## [135,] "Alistipes_finegoldii|SGB2301"                      
## [136,] "Oscillospiraceae_bacterium_Marseille_Q3528|SGB4778"
## [137,] "Faecalibacterium_sp_HTFF|SGB15340"                 
## [138,] "GGB3653_SGB4964|SGB4964"                           
## [139,] "Intestinimonas_butyriciproducens|SGB15126"         
## [140,] "GGB9719_SGB15272|SGB15272"                         
## [141,] "Blautia_glucerasea|SGB4816"                        
## [142,] "GGB2653_SGB3574|SGB3574"                           
## [143,] "Veillonella_dispar|SGB6952"                        
## [144,] "bacterium_210917_DFI_7_65|SGB14999"                
## [145,] "GGB9646_SGB15123|SGB15123"                         
## [146,] "GGB51441_SGB71759|SGB71759"                        
## [147,] "GGB33512_SGB15203|SGB15203"                        
## [148,] "Alistipes_indistinctus|SGB2325"                    
## [149,] "Candidatus_Borkfalkia_ceftriaxoniphila|SGB14027"   
## [150,] "GGB9062_SGB13981|SGB13981"                         
## [151,] "Lachnospiraceae_bacterium|SGB4781"                 
## [152,] "Phocea_massiliensis|SGB14837"                      
## [153,] "Guopingia_tenuis|SGB14127"                         
## [154,] "GGB3109_SGB4121|SGB4121"                           
## [155,] "GGB9534_SGB14937|SGB14937"                         
## [156,] "Anaerotruncus_rubiinfantis|SGB25416"               
## [157,] "Lachnospiraceae_bacterium|SGB4953"                 
## [158,] "Oscillibacter_sp_PC13|SGB7258"                     
## [159,] "Blautia_SGB4815|SGB4815"                           
## [160,] "Lawsonibacter_hominis|SGB15131"                    
## [161,] "Clostridium_SGB4750|SGB4750"                       
## [162,] "Holdemania_filiformis|SGB4046"                     
## [163,] "Anaeromassilibacillus_senegalensis|SGB14894"       
## [164,] "Lachnospira_pectinoschiza|SGB5089"                 
## [165,] "Anaerofilum_hominis|SGB79822"                      
## [166,] "Lachnotalea_sp_AF33_28|SGB5200"                    
## [167,] "Senegalimassilia_anaerobia|SGB14824_group"         
## [168,] "GGB9345_SGB14311|SGB14311"                         
## [169,] "Blautia_obeum|SGB4811"                             
## [170,] "Ligaoa_zhengdingensis|SGB14839"                    
## [171,] "Adlercreutzia_equolifaciens|SGB14797"              
## [172,] "Blautia_wexlerae|SGB4837"                          
## [173,] "GGB9787_SGB15410|SGB15410"                         
## [174,] "GGB36331_SGB15121|SGB15121"                        
## [175,] "Clostridium_sp_AM22_11AC|SGB4749"                  
## [176,] "Ruminococcus_sp_AF41_9|SGB25497"                   
## [177,] "Anaerotruncus_colihominis|SGB14963"                
## [178,] "Eubacterium_ramulus|SGB4959"                       
## [179,] "Enterocloster_lavalensis|SGB4725"                  
## [180,] "GGB2848_SGB3813|SGB3813"                           
## [181,] "Blautia_faecis|SGB4820"                            
## [182,] "Clostridiaceae_bacterium_Marseille_Q4145|SGB4769"  
## [183,] "Roseburia_sp_AF02_12|SGB4938"                      
## [184,] "Butyricimonas_faecihominis|SGB1786"                
## [185,] "Youxingia_wuxianensis|SGB82503"                    
## [186,] "Intestinimonas_gabonensis|SGB79840"                
## [187,] "GGB9064_SGB13983|SGB13983"                         
## [188,] "GGB45613_SGB63326|SGB63326"                        
## [189,] "GGB4552_SGB6276|SGB6276"                           
## [190,] "Blautia_massiliensis|SGB4826"                      
## [191,] "Lachnospiraceae_bacterium_CLA_AA_H215|SGB4777"     
## [192,] "Dorea_formicigenerans|SGB4575"                     
## [193,] "Clostridia_bacterium_UC5_1_1D1|SGB14995"           
## [194,] "GGB9765_SGB15382|SGB15382"                         
## [195,] "GGB9770_SGB15390|SGB15390"                         
## [196,] "Clostridium_sp_AM49_4BH|SGB4652"                   
## [197,] "GGB3537_SGB4727|SGB4727"                           
## [198,] "Wansuia_hejianensis|SGB25431"                      
## [199,] "Coprococcus_eutactus|SGB5121"                      
## [200,] "Coprobacter_secundus|SGB1962"                      
## [201,] "GGB3510_SGB4687|SGB4687"                           
## [202,] "Mediterraneibacter_faecis|SGB4563"                 
## [203,] "Ruminococcus_torques|SGB4608"                      
## [204,] "Coprococcus_comes|SGB4577"                         
## [205,] "Streptococcus_parasanguinis|SGB8071"               
## [206,] "Provencibacterium_massiliense|SGB14838"            
## [207,] "Eubacterium_ventriosum|SGB5045"                    
## [208,] "Enterocloster_citroniae|SGB4761"                   
## [209,] "GGB51647_SGB4348|SGB4348"                          
## [210,] "GGB9635_SGB15106|SGB15106"                         
## [211,] "GGB9758_SGB15368|SGB15368"                         
## [212,] "GGB9708_SGB15234|SGB15234"                         
## [213,] "GGB3304_SGB4367|SGB4367"                           
## [214,] "Wujia_chipingensis|SGB5111"                        
## [215,] "Clostridiaceae_bacterium_Marseille_Q4149|SGB15091" 
## [216,] "Paraprevotella_clara|SGB1798"                      
## [217,] "Agathobaculum_butyriciproducens|SGB14991"          
## [218,] "GGB9296_SGB14253|SGB14253"                         
## [219,] "Butyricimonas_paravirosa|SGB1785"                  
## [220,] "Dorea_longicatena|SGB4581"                         
## [221,] "Ruminococcus_lactaris|SGB4557"                     
## [222,] "Alistipes_dispar|SGB2311"                          
## [223,] "Dysosmobacter_SGB15077|SGB15077"                   
## [224,] "GGB13489_SGB15224|SGB15224"                        
## [225,] "Clostridium_sp_AF12_28|SGB4715"                    
## [226,] "GGB34797_SGB14322|SGB14322"                        
## [227,] "Anaerobutyricum_hallii|SGB4532"                    
## [228,] "Clostridium_SGB48024|SGB48024"                     
## [229,] "Slackia_isoflavoniconvertens|SGB14773"             
## [230,] "GGB3321_SGB4394|SGB4394"                           
## [231,] "GGB9237_SGB14179|SGB14179"                         
## [232,] "Anaerotruncus_massiliensis|SGB14965"               
## [233,] "GGB9531_SGB14932|SGB14932"                         
## [234,] "GGB3619_SGB4894|SGB4894"                           
## [235,] "Streptococcus_salivarius|SGB8007_group"            
## [236,] "Oscillibacter_valericigenes|SGB15076"              
## [237,] "Anaerostipes_hadrus|SGB4540"                       
## [238,] "Blautia_wexlerae|SGB4831"                          
## [239,] "GGB2970_SGB3952|SGB3952"                           
## [240,] "Lawsonibacter_SGB15145|SGB15145"                   
## [241,] "GGB9563_SGB14975|SGB14975"                         
## [242,] "Blautia_glucerasea|SGB4804"                        
## [243,] "Anaerostipes_hadrus|SGB4547"                       
## [244,] "Lachnospiraceae_bacterium|SGB4782"                 
## [245,] "GGB2998_SGB3988|SGB3988"                           
## [246,] "GGB9616_SGB15052|SGB15052"                         
## [247,] "GGB9524_SGB14924|SGB14924"                         
## [248,] "Segatella_copri|SGB1626"                           
## [249,] "Mediterraneibacter_butyricigenes|SGB25493"         
## [250,] "Haemophilus_parainfluenzae|SGB9712"                
## [251,] "Anaerosacchariphilus_sp_NSJ_68|SGB4772"            
## [252,] "Anaerotignum_sp_MSJ_24|SGB5180"                    
## [253,] "Dorea_sp_AF36_15AT|SGB4552"                        
## [254,] "GGB58158_SGB79798|SGB79798"                        
## [255,] "Pseudoflavonifractor_capillosus|SGB15140"          
## [256,] "Oliverpabstia_intestinalis|SGB4868"                
## [257,] "Veillonella_parvula|SGB6939"                       
## [258,] "Colidextribacter_sp_210702_DFI_3_9|SGB15146"       
## [259,] "Coprobacter_fastidiosus|SGB1963"                   
## [260,] "Veillonella_rogosae|SGB6956"                       
## [261,] "Eubacteriaceae_bacterium|SGB3958"                  
## [262,] "TM7_phylum_sp_oral_taxon_352|SGB19860_group"       
## [263,] "Enterocloster_hominis|SGB4721"                     
## [264,] "Rothia_mucilaginosa|SGB16971_group"                
## [265,] "Blautia_luti|SGB4832"                              
## [266,] "Blautia_sp_OF03_15BH|SGB4779"                      
## [267,] "Streptococcus_australis|SGB8059_group"             
## [268,] "Ruminococcus_gnavus|SGB4571"                       
## [269,] "Streptococcus_thermophilus|SGB8002"                
## [270,] "Roseburia_hominis|SGB4659"                         
## [271,] "Lentihominibacter_faecis|SGB3957"                  
## [272,] "Actinomyces_sp_ICM47|SGB17167_group"               
## [273,] "Evtepia_gabavorous|SGB15120"                       
## [274,] "Blautia_obeum|SGB4810"                             
## [275,] "GGB3480_SGB4648|SGB4648"                           
## [276,] "Ellagibacter_isourolithinifaciens|SGB14816_group"  
## [277,] "GGB2982_SGB3964|SGB3964"                           
## [278,] "Anaerobutyricum_soehngenii|SGB4537"                
## [279,] "Bacteroides_xylanisolvens|SGB1867"                 
## [280,] "Neobittarella_massiliensis|SGB7264"                
## [281,] "Faecalibacillus_intestinalis|SGB6754"              
## [282,] "GGB9708_SGB15233|SGB15233"                         
## [283,] "Enterocloster_asparagiformis|SGB4724"              
## [284,] "GGB51884_SGB49168|SGB49168"                        
## [285,] "Enterocloster_aldenensis|SGB4762"                  
## [286,] "Eggerthella_lenta|SGB14809"                        
## [287,] "Bacteroides_fragilis|SGB1855"                      
## [288,] "Veillonella_atypica|SGB6936"                       
## [289,] "Enterocloster_bolteae|SGB4758"                     
## [290,] "Romboutsia_timonensis|SGB6148"                     
## [291,] "Roseburia_lenta|SGB4957"                           
## [292,] "GGB3288_SGB4342|SGB4342"                           
## [293,] "GGB9574_SGB14987|SGB14987"                         
## [294,] "Dorea_longicatena|SGB4582"                         
## [295,] "Eubacterium_sp_AF34_35BH|SGB5051"                  
## [296,] "GGB9640_SGB15115|SGB15115"                         
## [297,] "Clostridiaceae_unclassified_SGB4771|SGB4771"       
## [298,] "GGB9766_SGB15383|SGB15383"                         
## [299,] "Lachnospiraceae_bacterium_OM04_12BH|SGB4893"       
## [300,] "Clostridium_SGB6173|SGB6173"                       
## [301,] "GGB9616_SGB15051|SGB15051"                         
## [302,] "Intestinibacter_bartlettii|SGB6140"

common_microbiome <- intersect(
  colnames(data_microbiome_original), 
  colnames(data_microbiome_validation))[-c(1:2)]

tr1 <- data_microbiome_original %>% 
  mutate(Data = if_else(Country == 'CZ', 'CZ_tr', 'IT_tr')) %>% 
  select(Data, Diet, all_of(common_microbiome))

tr2 <- data_microbiome_validation %>% 
  mutate(Data = 'valid') %>% 
  select(Data, Diet, all_of(common_microbiome))

data_merged <- bind_rows(tr1, tr2)

data_microbiome_validation$`Wujia_chipingenss|SGB5111`
## NULL
data_microbiome_validation$`Wujia_chipingensis|SGB5111`
##   [1] -2.2659270 -1.8366784 -1.5719579 -1.5649595 -1.3519655 -1.9273808
##   [7] -0.5872619 -1.9902827 -0.9982257 -1.6966210 -2.1930395 -0.1319288
##  [13] -2.1260143 -1.9245373 -2.0088636  0.5526297  3.9926563 -0.7907929
##  [19] -2.1891711 -1.9395276 -2.1353811 -2.1654279 -2.2360908 -0.3930113
##  [25] -2.3957911 -0.2977329 -1.4533578 -1.4072936 -2.0794670  1.8049062
##  [31]  1.5258843  3.6617309  3.8680214 -2.0412347  0.4626366  0.3866172
##  [37]  5.8044209 -2.2201075 -0.4444399 -2.1377634 -2.1399921 -2.0933531
##  [43]  4.5931692 -1.6871195  2.8164681 -2.5644101 -2.0585439 -2.6629664
##  [49]  0.6723405  4.4534434  4.1610574  5.3832639 -1.6933290 -2.5382040
##  [55] -2.4107280 -2.1198776  5.3413842 -1.8967301 -1.5487724 -2.1396917
##  [61] -1.7628668  4.1335815  5.0872100  3.6031291 -1.8040730  4.9799740
##  [67] -0.9850500 -1.5123666 -0.5770878 -1.7964678 -2.1354643  2.3798401
##  [73]  5.9776544 -2.2009657 -2.4395272  4.0615507 -1.3941153 -2.3882306
##  [79] -2.2799115  4.6054080 -1.5516203 -0.4367249 -1.5615449  4.0790976
##  [85]  1.0917225  3.7139508 -1.8139512 -1.1749955 -1.6831395 -1.5614355
##  [91] -1.9565894 -0.1425388 -1.0454574 -2.1128182  4.9182020 -1.6631032
##  [97] -2.3185501 -1.6734295 -1.6233100 -1.5638780  3.3149668  4.7828809
## [103] -1.5505336

3.2 Explore

3.2.0.1 Distributions - CLR-transformed

# Keep only the taxa columns (drop the Sample:Diet metadata) and complete cases
check <- data_microbiome_original %>% 
  dplyr::select(-c(Sample:Diet)) %>% 
  na.omit()

# Draw an 8 x 7 grid of histograms for 56 randomly selected taxa
size <- c(8, 7)
par(mfrow = c(size[1], size[2]))
par(mar = c(2, 1.5, 2, 0.5))
set.seed(16)
ran <- sample(1:ncol(check), size[1] * size[2], replace = FALSE)
for (x in ran) {
  hist(check[, x], 
       16, 
       col = 'blue', 
       main = colnames(check)[x]
  )
}

Histograms of CLR-transformed proportions for a random selection of 56 bacterial taxa (proportion: the number of sequences assigned to each bacterium relative to the total library depth)
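
For orientation, the centred log-ratio (CLR) transformation behind these histograms can be sketched in a few lines of base R. This is a minimal illustration on made-up counts, not the preprocessing used in this report: the function name `clr_transform`, the toy matrix, and the pseudocount of 1 used to avoid log(0) are all assumptions.

```r
# Toy count matrix: 3 samples (rows) x 4 taxa (columns); values are invented
counts <- matrix(
  c(10,  0, 5, 85,
    20, 30, 0, 50,
     1,  1, 1, 97),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("sample", 1:3), paste0("taxon", 1:4))
)

# CLR: log of each value minus the mean log value within the same sample.
# The pseudocount (an assumption of this sketch) avoids log(0).
clr_transform <- function(x, pseudocount = 1) {
  log_x <- log(x + pseudocount)
  sweep(log_x, 1, rowMeans(log_x), FUN = "-")
}

clr_values <- clr_transform(counts)

# By construction, CLR values sum to zero within each sample
round(rowSums(clr_values), 10)
```

A negative CLR value therefore means that a taxon lies below the geometric mean of its sample, which is why the histograms extend to both sides of zero.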

3.2.0.2 Taxon proportions across groups

colo <- c('#329243', '#F9FFAF')

data_merged <- na.omit(data_merged)

outcomes <- common_microbiome[
    sample(
      1:length(common_microbiome), 35, replace = FALSE
      )
  ]

boxplot_cond <- function(variable) {
  
  
  p <- ggboxplot(data_merged, 
                 x = 'Diet', 
                 y = variable, 
                 fill = 'Diet', 
                 tip.length = 0.15,
                 palette = colo,
                 outlier.shape = 1,
                 lwd = 0.25,
                 outlier.size = 0.8,
                 facet.by = 'Data',
                 title = variable,
                 ylab = 'CLR(taxa proportion)') +
    
    theme(
      plot.title = element_text(size = 10), 
      axis.title = element_text(size = 8),  
      axis.text.y = element_text(size = 7),
      axis.text.x = element_blank(),
      axis.title.x = element_blank()
    ) 
    
  return(p)
}

# Plot all outcomes
plots <- map(outcomes, boxplot_cond)

# Create a matrix of plots
plots_arranged <- ggarrange(plotlist = plots, ncol = 5, nrow = 7,  common.legend = TRUE)
plots_arranged

CLR-transformed counts of 35 randomly chosen bacterial taxa, across all 3 cohorts (the Czech and Italian training cohorts and an independent Czech validation cohort) and both dietary groups

4 Linear models across taxa

For each taxon, we will fit a separate linear model in which the CLR-transformed read count is the outcome variable and country (Italy vs Czech), diet (vegan vs omnivore), and their interaction (country:diet) are fixed-effect predictors. Each model has the following form:

\[ \mathrm{CLR}(N_i) = \alpha + \beta_{1} \times country + \beta_{2} \times diet + \beta_{3} \times country:diet + \epsilon \] where \(N_i\) is the read count of the \(i\)-th bacterial taxon.

The variables were coded as follows: \(diet = -0.5\) for omnivores and \(diet = 0.5\) for vegans; \(country = -0.5\) for the Czech cohort and \(country = 0.5\) for the Italian cohort.
This parameterization allows us to interpret the linear model summary output as presenting the conditional effects of diet averaged across both countries and the conditional effects of country averaged across both diet groups. We will then use the emmeans package (Lenth 2024) to obtain specific estimates for the effect of diet in the Italian and Czech cohorts separately, still from a single model.
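
The effect of this coding can be checked on simulated data. The sketch below uses invented numbers and hypothetical object names (`sim`, `fit`, `cell_means`); it only illustrates that, with ±0.5 coding, the `Diet_VEGAN` coefficient equals the diet difference averaged over the two countries, even when the groups are unbalanced.

```r
set.seed(1)

# Simulated, unbalanced 2 x 2 design (all numbers are invented)
sim <- data.frame(
  Country = rep(c("CZ", "IT"), times = c(40, 25)),
  Diet    = c(rep(c("OMNI", "VEGAN"), times = c(15, 25)),
              rep(c("OMNI", "VEGAN"), times = c(15, 10)))
)
sim$Country_IT <- ifelse(sim$Country == "IT", 0.5, -0.5)
sim$Diet_VEGAN <- ifelse(sim$Diet == "VEGAN", 0.5, -0.5)
sim$y <- 1 + 0.8 * sim$Country_IT - 1.2 * sim$Diet_VEGAN +
  0.6 * sim$Country_IT * sim$Diet_VEGAN + rnorm(nrow(sim))

fit <- lm(y ~ Country_IT * Diet_VEGAN, data = sim)

# Diet difference (vegan - omnivore) within each country, from cell means
cell_means <- tapply(sim$y, list(sim$Country, sim$Diet), mean)
diet_diff  <- cell_means[, "VEGAN"] - cell_means[, "OMNI"]

# The coefficient equals the unweighted mean of the two country-specific differences
c(coefficient = unname(coef(fit)["Diet_VEGAN"]),
  averaged    = mean(diet_diff))
```

In the same toy model, the interaction coefficient equals the Italian diet difference minus the Czech one, so a single fit carries both the average and the country-specific information.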

Taxa that show a significant diet effect (average effect of diet across both countries, adjusted for multiple comparisons with FDR < 0.1) will then be visualized in a forest plot, together with the country-specific diet effects and the diet effect estimated in the independent validation cohort, to evaluate how generalizable these findings are (see the external validation section).

Note that the p-values for the average effects are the same as those produced by car::Anova(model, type = 'III').
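
This equivalence can be illustrated without `car`. The sketch below, on simulated data with invented numbers, uses base R's `drop1()` with `scope = . ~ .` as a stand-in for a Type III table (an assumption of this example; `car::Anova()` itself is not called): each term is dropped in turn from the full model, and the resulting F-test p-values match the t-test p-values from `summary()`.

```r
set.seed(2)

# Simulated data with the same +/-0.5 coding (all numbers are invented)
sim <- data.frame(
  Country_IT = rep(c(-0.5, 0.5), times = c(45, 35)),
  Diet_VEGAN = c(rep(c(-0.5, 0.5), times = c(20, 25)),
                 rep(c(-0.5, 0.5), times = c(20, 15)))
)
sim$y <- 0.5 * sim$Country_IT - 1 * sim$Diet_VEGAN + rnorm(nrow(sim))

fit <- lm(y ~ Country_IT * Diet_VEGAN, data = sim)

# t-test p-values for the three slopes (intercept row removed)
p_t <- summary(fit)$coefficients[-1, "Pr(>|t|)"]

# Marginal F tests: each term is dropped from the full model in turn
marginal <- drop1(fit, scope = . ~ ., test = "F")
p_F <- marginal[names(p_t), "Pr(>F)"]

cbind(p_t, p_F)
```

Each term has one degree of freedom here, so the F statistic is the square of the corresponding t statistic and the two p-values coincide.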

4.1 Select data

data_analysis <- data_microbiome_original %>%
  na.omit() %>%
  dplyr::mutate(
    Diet_VEGAN = as.numeric(
      dplyr::if_else(
        Diet == "VEGAN", 0.5, -0.5
      )
    ),
    Country_IT = as.numeric(
      dplyr::if_else(
        Country == "IT", 0.5, -0.5
      )
    )
  ) %>%
  dplyr::select(
    Sample,
    Country,
    Country_IT,
    Diet,
    Diet_VEGAN,
    dplyr::everything()
  )

summary(data_analysis[ , 1:12])
##     Sample            Country            Country_IT           Diet          
##  Length:166         Length:166         Min.   :-0.50000   Length:166        
##  Class :character   Class :character   1st Qu.:-0.50000   Class :character  
##  Mode  :character   Mode  :character   Median :-0.50000   Mode  :character  
##                                        Mean   :-0.03012                     
##                                        3rd Qu.: 0.50000                     
##                                        Max.   : 0.50000                     
##    Diet_VEGAN       Bacteroides_stercoris|SGB1830 Alistipes_putredinis|SGB2318
##  Min.   :-0.50000   Min.   :-4.8322               Min.   :-5.438              
##  1st Qu.:-0.50000   1st Qu.:-3.6490               1st Qu.: 3.908              
##  Median : 0.50000   Median :-1.3277               Median : 6.130              
##  Mean   : 0.08434   Mean   : 0.5654               Mean   : 4.703              
##  3rd Qu.: 0.50000   3rd Qu.: 4.7830               3rd Qu.: 7.159              
##  Max.   : 0.50000   Max.   : 9.4417               Max.   : 9.897              
##  Candidatus_Cibionibacter_quicibialis|SGB15286 Bacteroides_uniformis|SGB1836
##  Min.   :-2.250                                Min.   :-3.273               
##  1st Qu.: 4.278                                1st Qu.: 4.743               
##  Median : 5.405                                Median : 6.158               
##  Mean   : 5.113                                Mean   : 5.964               
##  3rd Qu.: 6.447                                3rd Qu.: 7.395               
##  Max.   : 9.195                                Max.   :10.043               
##  Eubacterium_siraeum|SGB4198 GGB9602_SGB15031|SGB15031
##  Min.   :-5.694              Min.   :-4.2797          
##  1st Qu.:-4.539              1st Qu.:-2.6107          
##  Median :-2.937              Median : 0.3740          
##  Mean   :-1.269              Mean   : 0.5761          
##  3rd Qu.: 2.030              3rd Qu.: 3.4878          
##  Max.   : 6.338              Max.   : 8.0727          
##  Phocaeicola_vulgatus|SGB1814
##  Min.   :-3.802              
##  1st Qu.: 4.296              
##  Median : 5.775              
##  Mean   : 5.483              
##  3rd Qu.: 7.100              
##  Max.   :10.519

4.1.1 Define the number of microbiome features and covariates

n_covarites <- 5
n_features <- ncol(data_analysis) - n_covarites

4.1.2 Create empty objects

outcome <- vector('character', n_features)
logFD_VGdiet_inCZ <- vector('double', n_features)
logFD_VGdiet_inIT <- vector('double', n_features)
logFD_VGdiet_avg <- vector('double', n_features)

logFD_ITcountry_avg <- vector('double', n_features)
diet_country_int <- vector('double', n_features)


P_VGdiet_inCZ <- vector('double', n_features)
P_VGdiet_inIT <- vector('double', n_features)
P_VGdiet_avg <- vector('double', n_features)

P_ITcountry_avg <- vector('double', n_features)
P_diet_country_int <- vector('double', n_features)


CI_L_VGdiet_inCZ <- vector('double', n_features)
CI_L_VGdiet_inIT <- vector('double', n_features)
CI_L_VGdiet_avg <- vector('double', n_features)

CI_U_VGdiet_inCZ <- vector('double', n_features)
CI_U_VGdiet_inIT <- vector('double', n_features)
CI_U_VGdiet_avg <- vector('double', n_features)

4.1.3 Estimate over outcomes

for (i in 1:n_features) {
  
  ## define variable
  data_analysis$outcome <- data_analysis[, (i + n_covarites)]

  ## fit model
  model <- lm(outcome ~ Country_IT * Diet_VEGAN, data = data_analysis)

  ## get contrast (effects of diet BY COUNTRY)
  contrast_emm <- summary(
    pairs(
      emmeans(
        model,
        specs = ~ Diet_VEGAN | Country_IT
        ),
      interaction = TRUE,
      adjust = "none"
      ),
    infer = c(TRUE, TRUE)
    )

  ## save results
  outcome[i] <- names(data_analysis)[i + n_covarites]
  
  ## country effect
  logFD_ITcountry_avg[i] <- summary(model)$coefficients[
    which(
      names(model$coefficients) == "Country_IT"
    ), 1
  ]

  P_ITcountry_avg[i] <- summary(model)$coefficients[
    which(
      names(model$coefficients) == "Country_IT"
    ), 4
  ]
  
  
  ## diet effect
  tr <- confint(model)
  
  CI_L_VGdiet_avg[i] <- tr[which(row.names(tr) == 'Diet_VEGAN'),][1]
  CI_U_VGdiet_avg[i] <- tr[which(row.names(tr) == 'Diet_VEGAN'),][2]
  
  logFD_VGdiet_avg[i] <- summary(model)$coefficients[
    which(
      names(model$coefficients) == "Diet_VEGAN"
    ), 1
  ]

  P_VGdiet_avg[i] <- summary(model)$coefficients[
    which(
      names(model$coefficients) == "Diet_VEGAN"
    ), 4
  ]
  
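  ## country-specific diet effects; pairs() contrasts the lower-coded level
  ## against the higher one (omnivore - vegan), so signs are flipped and
  ## CI bounds swapped to express vegan - omnivore
  logFD_VGdiet_inCZ[i] <- -contrast_emm[1,3]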
  P_VGdiet_inCZ[i] <- contrast_emm$p.value[1]
  CI_L_VGdiet_inCZ[i] <- -contrast_emm$upper.CL[1]
  CI_U_VGdiet_inCZ[i] <- -contrast_emm$lower.CL[1]
  
  logFD_VGdiet_inIT[i] <- -contrast_emm[2,3]
  P_VGdiet_inIT[i] <- contrast_emm$p.value[2]
  CI_L_VGdiet_inIT[i] <- -contrast_emm$upper.CL[2]
  CI_U_VGdiet_inIT[i] <- -contrast_emm$lower.CL[2]
  
  ## interaction
  diet_country_int[i] <- summary(model)$coefficients[
    which(
      names(model$coefficients) == "Country_IT:Diet_VEGAN"
    ), 1
  ]

  P_diet_country_int[i] <- summary(model)$coefficients[
    which(
      names(model$coefficients) == "Country_IT:Diet_VEGAN"
    ), 4
  ]
}
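
The country-specific estimates extracted from `emmeans` above are linear combinations of the model coefficients: with ±0.5 coding, the diet effect is \(\beta_2 - 0.5\beta_3\) in the Czech cohort and \(\beta_2 + 0.5\beta_3\) in the Italian cohort. The following base-R sketch on simulated data (invented numbers; the helper `diet_effect_in()` is hypothetical and not part of the analysis) reproduces such estimates with their confidence intervals and p-values, as a cross-check rather than a replacement for `emmeans`.

```r
set.seed(3)

# Simulated, unbalanced design with the same +/-0.5 coding (numbers are invented)
sim <- data.frame(
  Country_IT = rep(c(-0.5, 0.5), times = c(50, 40)),
  Diet_VEGAN = c(rep(c(-0.5, 0.5), times = c(20, 30)),
                 rep(c(-0.5, 0.5), times = c(25, 15)))
)
sim$y <- 2 + 1.0 * sim$Country_IT - 1.5 * sim$Diet_VEGAN +
  0.8 * sim$Country_IT * sim$Diet_VEGAN + rnorm(nrow(sim))

fit <- lm(y ~ Country_IT * Diet_VEGAN, data = sim)

# Hypothetical helper: diet effect (vegan - omnivore) at a given country code
diet_effect_in <- function(model, country_code, level = 0.95) {
  # Weights for (Intercept), Country_IT, Diet_VEGAN, Country_IT:Diet_VEGAN
  w   <- c(0, 0, 1, country_code)
  est <- sum(w * coef(model))
  se  <- sqrt(drop(t(w) %*% vcov(model) %*% w))
  tq  <- qt(1 - (1 - level) / 2, df.residual(model))
  c(estimate = est,
    CI_L = est - tq * se,
    CI_U = est + tq * se,
    p = 2 * pt(-abs(est / se), df.residual(model)))
}

effect_CZ <- diet_effect_in(fit, -0.5)
effect_IT <- diet_effect_in(fit,  0.5)
rbind(CZ = effect_CZ, IT = effect_IT)
```

Because the model is saturated for a 2 x 2 design, each estimate equals the plain difference in group means within that country, while the standard errors use the residual variance pooled over all four groups.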

4.1.4 Results table

result_microbiome <- data.frame(
  outcome,
  logFD_ITcountry_avg, P_ITcountry_avg,
  logFD_VGdiet_avg, P_VGdiet_avg,
  logFD_VGdiet_inCZ, P_VGdiet_inCZ,
  logFD_VGdiet_inIT, P_VGdiet_inIT,
  diet_country_int, P_diet_country_int,
  CI_L_VGdiet_avg, CI_U_VGdiet_avg,
  CI_L_VGdiet_inCZ, CI_U_VGdiet_inCZ,
  CI_L_VGdiet_inIT, CI_U_VGdiet_inIT
)

4.1.5 Adjust p values

result_microbiome <- result_microbiome %>% 
  dplyr::mutate(
    fdr_ITcountry_avg = p.adjust(P_ITcountry_avg, method = 'BH'),
    fdr_VGdiet_avg = p.adjust(P_VGdiet_avg, method = 'BH'),
    
    fdr_VGdiet_inCZ = p.adjust(P_VGdiet_inCZ, method = 'BH'),
    fdr_VGdiet_inIT = p.adjust(P_VGdiet_inIT, method = 'BH'),
    fdr_diet_country_int = p.adjust(P_diet_country_int, method = 'BH')
  ) %>% 
  dplyr::select(
    outcome,
    logFD_ITcountry_avg, P_ITcountry_avg, fdr_ITcountry_avg,
    logFD_VGdiet_avg, P_VGdiet_avg, fdr_VGdiet_avg,
    logFD_VGdiet_inCZ, P_VGdiet_inCZ, fdr_VGdiet_inCZ,
    logFD_VGdiet_inIT, P_VGdiet_inIT, fdr_VGdiet_inIT,
    diet_country_int, P_diet_country_int, fdr_diet_country_int,
    CI_L_VGdiet_avg, CI_U_VGdiet_avg,
    CI_L_VGdiet_inCZ, CI_U_VGdiet_inCZ,
    CI_L_VGdiet_inIT, CI_U_VGdiet_inIT
  )
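
As a reminder of what `method = 'BH'` computes, the sketch below reimplements the Benjamini-Hochberg adjustment by hand on five invented p-values and compares it with `p.adjust()`. The function name `bh_adjust` is hypothetical and is used for illustration only.

```r
# Hand-rolled Benjamini-Hochberg adjustment (illustration only)
bh_adjust <- function(p) {
  n <- length(p)
  ord <- order(p, decreasing = TRUE)    # largest p-value first
  scaled <- p[ord] * n / (n:1)          # p * n / rank (rank n for the largest)
  adjusted <- pmin(1, cummin(scaled))   # enforce monotonicity, cap at 1
  adjusted[order(ord)]                  # back to the original order
}

p_example <- c(0.001, 0.02, 0.03, 0.04, 0.5)   # invented p-values

cbind(manual   = bh_adjust(p_example),
      p.adjust = p.adjust(p_example, method = "BH"))
```

In the table above, each p-value column is adjusted separately across all tested taxa, so the number of tests entering the correction is the number of taxa rather than the number of columns.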

4.1.6 Show and save results

kableExtra::kable(result_microbiome %>% filter(fdr_VGdiet_avg < 0.05),
  caption = "Results of linear models of the CLR-transformed read counts of individual bacterial taxa, with `Diet`, `Country` and the `Diet:Country` interaction as predictors. Only taxa whose CLR-transformed proportion differs between diets (FDR < 0.05, average diet effect across both countries) are shown. `logFD` prefix: estimated effect (regression coefficient), i.e. how much the CLR-transformed read count differs in vegans compared to omnivores; `P`: p-value; `fdr`: p-value after adjustment for multiple comparisons; `CI_L` and `CI_U`: lower and upper bounds of the 95% confidence interval, respectively. The `avg` suffix denotes an effect averaged across subgroups, whereas `inCZ` and `inIT` denote the effect in the Czech or Italian cohort, respectively. All estimates in a single row are based on a single model."
)
Results of linear models of the CLR-transformed read counts of individual bacterial taxa, with Diet, Country and the Diet:Country interaction as predictors. Only taxa whose CLR-transformed proportion differs between diets (FDR < 0.05, average diet effect across both countries) are shown. logFD prefix: estimated effect (regression coefficient), i.e. how much the CLR-transformed read count differs in vegans compared to omnivores; P: p-value; fdr: p-value after adjustment for multiple comparisons; CI_L and CI_U: lower and upper bounds of the 95% confidence interval, respectively. The avg suffix denotes an effect averaged across subgroups, whereas inCZ and inIT denote the effect in the Czech or Italian cohort, respectively. All estimates in a single row are based on a single model.
outcome logFD_ITcountry_avg P_ITcountry_avg fdr_ITcountry_avg logFD_VGdiet_avg P_VGdiet_avg fdr_VGdiet_avg logFD_VGdiet_inCZ P_VGdiet_inCZ fdr_VGdiet_inCZ logFD_VGdiet_inIT P_VGdiet_inIT fdr_VGdiet_inIT diet_country_int P_diet_country_int fdr_diet_country_int CI_L_VGdiet_avg CI_U_VGdiet_avg CI_L_VGdiet_inCZ CI_U_VGdiet_inCZ CI_L_VGdiet_inIT CI_U_VGdiet_inIT
Escherichia_coli|SGB10068 0.2619264 0.5866636 0.6604881 -1.6056687 0.0010415 0.0124560 -1.7212276 0.0123629 0.1114726 -1.4901099 0.0297547 0.2433353 0.2311177 0.8103673 0.9929048 -2.5551229 -0.6562146 -3.0647170 -0.3777381 -2.8320819 -0.1481379
Bacteroides_clarus|SGB1832 1.9131572 0.0002848 0.0008431 -1.9852453 0.0001701 0.0031781 -1.6674895 0.0236093 0.1534603 -2.3030012 0.0018864 0.0961993 -0.6355117 0.5386671 0.9929048 -3.0036467 -0.9668440 -3.1085401 -0.2264388 -3.7424241 -0.8635783
Hydrogenoanaerobacterium_saccharovorans|SGB15350 1.5011518 0.0012291 0.0032813 -1.3582916 0.0033582 0.0313785 -1.3329832 0.0405517 0.2090509 -1.3835999 0.0334061 0.2561137 -0.0506167 0.9558340 0.9929048 -2.2592946 -0.4572885 -2.6079137 -0.0580527 -2.6570904 -0.1101095
GGB6649_SGB9391|SGB9391 0.2325600 0.5179518 0.6002620 1.0140390 0.0053203 0.0418620 1.7803147 0.0005900 0.0117612 0.2477634 0.6259425 0.9452363 -1.5325513 0.0342745 0.5717111 0.3052515 1.7228266 0.7773715 2.7832579 -0.7540470 1.2495737
Ruthenibacterium_lactatiformans|SGB15271 0.9036294 0.0030971 0.0073356 -0.8414746 0.0057914 0.0444010 -1.5432722 0.0003871 0.0100784 -0.1396771 0.7430182 0.9689172 1.4035951 0.0209147 0.5717111 -1.4356716 -0.2472776 -2.3840684 -0.7024759 -0.9795236 0.7001695
Lawsonibacter_asaccharolyticus|SGB15154 2.8249857 0.0000000 0.0000000 -1.5247873 0.0000971 0.0023730 -1.9303765 0.0004595 0.0105548 -1.1191981 0.0394843 0.2775605 0.8111784 0.2892321 0.9724308 -2.2780362 -0.7715384 -2.9962332 -0.8645198 -2.1838509 -0.0545454
Clostridium_fessum|SGB4705 -1.6158900 0.0000443 0.0001577 -1.1006841 0.0048020 0.0398833 -1.1368423 0.0384349 0.2063733 -1.0645259 0.0521082 0.3314966 0.0723164 0.9252760 0.9929048 -1.8607983 -0.3405699 -2.2124133 -0.0612713 -2.1388820 0.0098303
GGB9707_SGB15229|SGB15229 0.6452823 0.0445093 0.0756152 1.0866666 0.0008197 0.0106555 0.9548967 0.0357238 0.2054116 1.2184364 0.0075553 0.1328848 0.2635397 0.6797760 0.9929048 0.4574080 1.7159252 0.0644880 1.8453055 0.3290334 2.1078395
Lachnospiraceae_bacterium_AM48_27BH|SGB4706 -1.1131285 0.0045403 0.0102071 1.2676366 0.0012809 0.0147307 1.6523047 0.0029449 0.0463439 0.8829685 0.1082011 0.4901839 -0.7693361 0.3214104 0.9724308 0.5039082 2.0313650 0.5716194 2.7329899 -0.1964961 1.9624331
GGB9509_SGB14906|SGB14906 1.0817985 0.0089485 0.0182013 -1.6543521 0.0000804 0.0023730 -1.5741131 0.0072217 0.0835575 -1.7345911 0.0031102 0.0961993 -0.1604780 0.8446550 0.9929048 -2.4617079 -0.8469964 -2.7165316 -0.4316946 -2.8757193 -0.5934630
GGB45432_SGB63101|SGB63101 0.7232262 0.0088808 0.0181874 -0.7958911 0.0040635 0.0368175 -0.8984331 0.0212966 0.1415041 -0.6933491 0.0742736 0.3998957 0.2050839 0.7077520 0.9929048 -1.3350948 -0.2566874 -1.6614131 -0.1354531 -1.4554673 0.0687691
GGB2653_SGB3574|SGB3574 0.0113250 0.9753715 0.9753715 -1.4691557 0.0000921 0.0023730 -1.8428969 0.0004942 0.0105548 -1.0954145 0.0358765 0.2681768 0.7474824 0.3090608 0.9724308 -2.1924343 -0.7458771 -2.8663451 -0.8194487 -2.1177067 -0.0731223
Phocea_massiliensis|SGB14837 0.4892877 0.1130213 0.1698159 -1.1101840 0.0004001 0.0058102 -1.0621671 0.0155770 0.1164377 -1.1582010 0.0083931 0.1348712 -0.0960339 0.8759362 0.9929048 -1.7165656 -0.5038025 -1.9202046 -0.2041296 -2.0152693 -0.3011327
Lachnospiraceae_bacterium|SGB4953 -1.0110542 0.0196470 0.0376567 1.8996297 0.0000175 0.0007650 2.0042382 0.0011843 0.0208299 1.7950211 0.0035391 0.0961993 -0.2092171 0.8076874 0.9929048 1.0523587 2.7469007 0.8053392 3.2031372 0.5974763 2.9925660
Holdemania_filiformis|SGB4046 -0.8713113 0.0335927 0.0590836 -1.2995237 0.0016731 0.0185275 -2.2053986 0.0001804 0.0079526 -0.3936488 0.4942922 0.8905572 1.8117499 0.0272454 0.5717111 -2.1023519 -0.4966955 -3.3414106 -1.0693867 -1.5283776 0.7410800
Anaeromassilibacillus_senegalensis|SGB14894 1.0587520 0.0067348 0.0142817 -1.2016868 0.0021720 0.0223943 -1.3537201 0.0141471 0.1114726 -1.0496536 0.0559301 0.3474891 0.3040665 0.6939733 0.9929048 -1.9633423 -0.4400314 -2.4314720 -0.2759681 -2.1261882 0.0268810
Ruminococcus_sp_AF41_9|SGB25497 -0.8299026 0.0411850 0.0707719 1.7828783 0.0000179 0.0007650 2.7832380 0.0000025 0.0001522 0.7825187 0.1716609 0.6497039 -2.0007193 0.0141307 0.5281335 0.9865895 2.5791671 1.6564794 3.9099965 -0.3429672 1.9080046
Eubacterium_ramulus|SGB4959 -2.2432627 0.0000000 0.0000003 -1.3014646 0.0010277 0.0124560 -1.5154793 0.0066098 0.0835575 -1.0874499 0.0497902 0.3258371 0.4280295 0.5832078 0.9929048 -2.0701289 -0.5328003 -2.6031489 -0.4278097 -2.1738909 -0.0010088
Youxingia_wuxianensis|SGB82503 0.2836651 0.4199975 0.5187690 -1.0654123 0.0027893 0.0278003 -1.7207509 0.0006765 0.0126413 -0.4100738 0.4095044 0.8055383 1.3106771 0.0635956 0.7486670 -1.7582618 -0.3725628 -2.7011416 -0.7403602 -1.3893571 0.5692096
Blautia_massiliensis|SGB4826 -3.9054717 0.0000000 0.0000000 -1.7487130 0.0019052 0.0203449 -1.6636803 0.0353443 0.2054116 -1.8337457 0.0204108 0.2087421 -0.1700654 0.8782097 0.9929048 -2.8427577 -0.6546682 -3.2117672 -0.1155933 -3.3800841 -0.2874073
GGB9765_SGB15382|SGB15382 0.9724670 0.0052880 0.0115410 -1.0403415 0.0028965 0.0279370 -0.6868283 0.1601205 0.4949617 -1.3938547 0.0046947 0.1169756 -0.7070263 0.3055987 0.9724308 -1.7195795 -0.3611035 -1.6479586 0.2743020 -2.3538993 -0.4338100
GGB9770_SGB15390|SGB15390 1.1393138 0.0031158 0.0073356 -1.1023969 0.0042003 0.0369375 -1.1048957 0.0413047 0.2093238 -1.0998980 0.0419905 0.2853448 0.0049977 0.9947562 0.9950745 -1.8520565 -0.3527372 -2.1656735 -0.0441180 -2.1594776 -0.0403184
Ruminococcus_torques|SGB4608 -2.5450549 0.0000002 0.0000009 -2.9967166 0.0000000 0.0000002 -4.0037248 0.0000000 0.0000008 -1.9897083 0.0028797 0.0961993 2.0140166 0.0318587 0.5717111 -3.9152587 -2.0781744 -5.3034734 -2.7039763 -3.2879887 -0.6914278
Agathobaculum_butyriciproducens|SGB14991 -0.2799754 0.5736608 0.6497143 1.8298077 0.0003116 0.0049041 1.7918165 0.0116982 0.1114726 1.8677989 0.0085704 0.1348712 0.0759824 0.9391106 0.9929048 0.8492138 2.8104015 0.4042640 3.1793691 0.4818136 3.2537841
Clostridium_SGB48024|SGB48024 0.3495793 0.4685239 0.5603545 1.8824245 0.0001341 0.0027026 2.0858928 0.0025593 0.0425134 1.6789563 0.0145882 0.1671720 -0.4069364 0.6729259 0.9929048 0.9323470 2.8325021 0.7415212 3.4302643 0.3361033 3.0218094
GGB9531_SGB14932|SGB14932 0.6741149 0.0481296 0.0812355 -1.2220355 0.0004081 0.0058102 -1.0267774 0.0335676 0.2007343 -1.4172937 0.0035162 0.0961993 -0.3905163 0.5648881 0.9929048 -1.8905380 -0.5535330 -1.9727167 -0.0808380 -2.3621646 -0.4724228
Oscillibacter_valericigenes|SGB15076 2.2693025 0.0000001 0.0000007 -1.5455775 0.0002310 0.0038371 -0.3939028 0.4984511 0.7655611 -2.6972522 0.0000068 0.0020399 -2.3033494 0.0056168 0.3358826 -2.3558148 -0.7353402 -1.5403986 0.7525931 -3.8424531 -1.5520513
GGB9616_SGB15052|SGB15052 0.1922882 0.6243555 0.6939863 1.5604516 0.0001032 0.0023730 1.3778329 0.0139910 0.1114726 1.7430703 0.0019661 0.0961993 0.3652374 0.6418749 0.9929048 0.7865111 2.3343920 0.2826975 2.4729683 0.6491719 2.8369687
GGB58158_SGB79798|SGB79798 -0.0395133 0.9036721 0.9285153 -0.9249820 0.0051278 0.0414382 -1.2410264 0.0078830 0.0858020 -0.6089377 0.1881529 0.6772633 0.6320887 0.3337315 0.9724308 -1.5687010 -0.2812631 -2.1518967 -0.3301560 -1.5187792 0.3009038
Enterocloster_hominis|SGB4721 -0.4611836 0.2485952 0.3190128 -1.6735473 0.0000436 0.0014488 -2.1154645 0.0002425 0.0080565 -1.2316301 0.0301117 0.2433353 0.8838345 0.2688329 0.9724308 -2.4600398 -0.8870548 -3.2283612 -1.0025678 -2.3432697 -0.1199904
Streptococcus_thermophilus|SGB8002 -1.9892533 0.0000000 0.0000001 -2.6523655 0.0000000 0.0000000 -4.2589263 0.0000000 0.0000000 -1.0458046 0.0256343 0.2433353 3.2131217 0.0000024 0.0007185 -3.3010343 -2.0036966 -5.1768008 -3.3410518 -1.9626424 -0.1289669
Blautia_obeum|SGB4810 -1.2227533 0.0022206 0.0055796 1.8263749 0.0000071 0.0005271 2.8583481 0.0000008 0.0000597 0.7944017 0.1549579 0.6177656 -2.0639464 0.0095315 0.4749853 1.0496396 2.6031103 1.7592579 3.9574383 -0.3034471 1.8922505
GGB51884_SGB49168|SGB49168 0.1489066 0.6616867 0.7247045 1.2130993 0.0004677 0.0063559 1.3067445 0.0072659 0.0835575 1.1194540 0.0209440 0.2087421 -0.1872905 0.7831313 0.9929048 0.5423556 1.8838430 0.3576339 2.2558552 0.1714154 2.0674927
Roseburia_lenta|SGB4957 -1.4433630 0.0000086 0.0000362 1.7438636 0.0000001 0.0000112 2.9096930 0.0000000 0.0000001 0.5780342 0.1946353 0.6772633 -2.3316588 0.0002817 0.0421119 1.1237786 2.3639486 2.0322650 3.7871210 -0.2984027 1.4544712
GGB3288_SGB4342|SGB4342 -0.3043170 0.4247255 0.5187690 1.4470612 0.0002004 0.0035245 1.9437150 0.0004045 0.0100784 0.9504073 0.0788960 0.4138580 -0.9933077 0.1933818 0.9594504 0.6961415 2.1979808 0.8811543 3.0062757 -0.1109532 2.0117678
GGB9574_SGB14987|SGB14987 0.6979519 0.0760440 0.1235716 1.1288124 0.0044098 0.0376722 0.8497545 0.1264137 0.4308016 1.4078702 0.0117570 0.1671720 0.5581157 0.4763145 0.9929048 0.3569149 1.9007099 -0.2424901 1.9419991 0.3168594 2.4988811
Clostridiaceae_unclassified_SGB4771|SGB4771 -0.0888094 0.8522123 0.8878449 2.1375079 0.0000134 0.0007650 1.8328628 0.0072095 0.0835575 2.4421530 0.0003793 0.0567018 0.6092902 0.5230257 0.9929048 1.1976428 3.0773730 0.5029420 3.1627836 1.1137344 3.7705716
Lachnospiraceae_bacterium_OM04_12BH|SGB4893 -0.6037329 0.1247910 0.1811286 1.6753589 0.0000317 0.0011836 2.1178834 0.0001862 0.0079526 1.2328344 0.0271771 0.2433353 -0.8850489 0.2597427 0.9724308 0.9026890 2.4480288 1.0245459 3.2112209 0.1407319 2.3249370
GGB9616_SGB15051|SGB15051 0.0491406 0.9109482 0.9296024 1.7151507 0.0001356 0.0027026 1.5312508 0.0146741 0.1125016 1.8990506 0.0025687 0.0961993 0.3677999 0.6756214 0.9929048 0.8488687 2.5814327 0.3054509 2.7570507 0.6746353 3.1234660
Open code

if(file.exists('gitignore/result_microbiome_SGB30.csv') == FALSE){
  write.table(result_microbiome, 
              'gitignore/result_microbiome_SGB30.csv', row.names = FALSE)
  }

5 Elastic net

To assess the power of microbiome features to discriminate between diet strategies, we employed elastic net logistic regression.

As we expected a very high level of collinearity, we restricted \(\alpha\) to rather small values (0, 0.2 or 0.4). All features were standardized by dividing by 2 standard deviations (resulting in a standard deviation of 0.5).

The performance of the predictive models was evaluated through their capacity to discriminate between vegan and omnivore diets, using the out-of-sample area under the ROC curve (AUC; estimated with out-of-bag bootstrap) as the measure of discriminatory capacity.
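
As a minimal illustration of the scaling described above, the sketch below centres a toy variable and divides it by two standard deviations, which is what `arm::rescale()` does for numeric inputs with more than two distinct values. The helper `rescale_2sd` and the simulated values are hypothetical and are not part of the analysis.

```r
# Hypothetical helper: centre and divide by two standard deviations
rescale_2sd <- function(x) {
  (x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE))
}

set.seed(42)
clr_toy <- rnorm(50, mean = 3, sd = 2)  # simulated CLR-like values

z <- rescale_2sd(clr_toy)

mean(z)  # numerically zero
sd(z)    # 0.5 by construction
```

With this scaling, a one-unit change in a feature corresponds to a change of two standard deviations on the original scale, which keeps the penalized coefficients comparable across taxa.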

5.1 Prepare data for glmnet

Open code
data_microbiome_glmnet <- data_microbiome_original %>%
  na.omit() %>%
  dplyr::mutate(
    vegan = as.numeric(
      dplyr::if_else(
        Diet == "VEGAN", 1, 0
      )
    ),
    dplyr::across(
      `Bacteroides_stercoris|SGB1830`:`Clostridiaceae_unclassified_SGB4771|SGB4771`, 
      ~ arm::rescale(.)
    )
  ) %>%
  dplyr::select(
    vegan,
    dplyr::everything()
  ) %>%
  dplyr::select(
    Sample, vegan, 
    `Bacteroides_stercoris|SGB1830`:`Clostridiaceae_unclassified_SGB4771|SGB4771`
  )

5.2 Fit model

Open code
modelac <- "elanet_microbiome_SGB30"

assign(
  modelac,
  run(
    expr = clust_glmnet(
      data = data_microbiome_glmnet,
      outcome = "vegan",
      clust_id = "Sample",
      sample_method = "oos_boot",
      N = 500,
      alphas = c(0, 0.2, 0.4),
      family = "binomial",
      seed = 478
    ),
    path = paste0("gitignore/run/", modelac)
  )
)
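
The internals of `clust_glmnet()` are not shown in this report. The idea behind an out-of-bag bootstrap estimate of AUC can nevertheless be sketched in base R: fit the model on a bootstrap resample and evaluate it only on the observations that were not drawn. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses simulated data, an unpenalized `glm()` in place of the elastic net, no tuning of `alpha` or `lambda`, and a hypothetical helper `auc_rank`.

```r
set.seed(1)
n <- 120
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, 1, plogis(1.2 * dat$x))  # simulated binary outcome

# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) statistic
auc_rank <- function(score, label) {
  r <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

B <- 200                 # number of bootstrap iterations
auc_oob <- numeric(B)

for (b in seq_len(B)) {
  in_bag  <- sample(n, replace = TRUE)       # bootstrap resample
  out_bag <- setdiff(seq_len(n), in_bag)     # observations never drawn

  fit   <- glm(y ~ x, family = binomial, data = dat[in_bag, ])
  score <- predict(fit, newdata = dat[out_bag, ])  # linear predictor

  auc_oob[b] <- auc_rank(score, dat$y[out_bag])
}

mean(auc_oob)                        # out-of-sample AUC estimate
quantile(auc_oob, c(0.025, 0.975))   # percentile interval
```

Because each model is scored only on observations it has not seen, the resulting AUC is less optimistic than the in-sample value.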

5.3 Model summary

Open code
elanet_microbiome_SGB30$model_summary
##   alpha     lambda       auc auc_OutOfSample auc_oos_CIL auc_oos_CIU  accuracy
## 1   0.2 0.05294768 0.9949201        0.887315   0.8047796   0.9523366 0.9698795
##   accuracy_OutOfSample accuracy_oos_CIL accuracy_oos_CIU
## 1            0.7973382        0.6949153        0.8833333

5.4 Calibration plot

Open code
elanet_microbiome_SGB30$plot

Calibration plot, showing outcome values (y-axis) according to the predictions of the elastic net model (x-axis). Grey curves show predictions from different bootstrap iterations. Red lines and shading show the average prediction across all iterations and its 95% confidence interval.

5.5 Estimated coefficients

Open code
data.frame(
  microbiome = row.names(
    elanet_microbiome_SGB30$betas
    )[
      which(
        abs(
          elanet_microbiome_SGB30$betas
          )>0
        )
      ],
  beta = elanet_microbiome_SGB30$betas[
    abs(
      elanet_microbiome_SGB30$betas
      )>0
    ]
  ) %>% 
  mutate(
    is_in_ExtValCoh = if_else(
      microbiome %in% names(data_microbiome_validation),
      1, 0
      )
    )
##                                           microbiome         beta
## 1                                        (Intercept)  0.511579554
## 2                       Alistipes_putredinis|SGB2318 -0.035673015
## 3                          GGB9602_SGB15031|SGB15031 -0.072598650
## 4                       Phocaeicola_vulgatus|SGB1814  0.055271158
## 5                  Sutterella_wadsworthensis|SGB9283  0.048565620
## 6                 Parabacteroides_distasonis|SGB1934  0.077620832
## 7                           GGB47687_SGB2286|SGB2286 -0.034436116
## 8                 Brotolimicola_acetigignens|SGB4914  0.014966224
## 9              Faecalibacterium_prausnitzii|SGB15339  0.095440885
## 10                Faecalibacterium_SGB15346|SGB15346  0.007147743
## 11                   Dialister_invisus|SGB5825_group  0.049768508
## 12                        GGB33469_SGB15236|SGB15236 -0.042846768
## 13                     Bacteroides_eggerthii|SGB1829 -0.176302045
## 14                         Escherichia_coli|SGB10068 -0.060276967
## 15                           GGB3175_SGB4191|SGB4191 -0.114603501
## 16   Oscillospiraceae_bacterium_CLA_AA_H250|SGB14861 -0.026277500
## 17                         GGB9715_SGB15267|SGB15267  0.027979659
## 18                 Lachnospira_pectinoschiza|SGB5075 -0.197650618
## 19                        Bacteroides_clarus|SGB1832 -0.283159299
## 20                    Anaerotignum_faecicola|SGB5190 -0.151027161
## 21  Hydrogenoanaerobacterium_saccharovorans|SGB15350 -0.216568806
## 22                        Bacteroides_faecis|SGB1860  0.047622592
## 23                         Roseburia_hominis|SGB4936  0.059440357
## 24                         GGB9712_SGB15244|SGB15244 -0.010773625
## 25                  Odoribacter_splanchnicus|SGB1790 -0.013240831
## 26                        Bacteroides_ovatus|SGB1871  0.189828975
## 27                         GGB9614_SGB15049|SGB15049  0.070090055
## 28              Barnesiella_intestinihominis|SGB1965 -0.081868455
## 29          Agathobaculum_butyriciproducens|SGB14993  0.095132263
## 30                           GGB6649_SGB9391|SGB9391  0.206425435
## 31                       Ruminococcus_bromii|SGB4285  0.074576850
## 32           Lawsonibacter_asaccharolyticus|SGB15154 -0.128272502
## 33                  Phocaeicola_massiliensis|SGB1812 -0.001813712
## 34                        Clostridium_fessum|SGB4705 -0.228912823
## 35                     Clostridium_sp_AF36_4|SGB4644  0.131716733
## 36                         GGB9627_SGB15081|SGB15081  0.026856354
## 37                         GGB9707_SGB15229|SGB15229  0.154701269
## 38                   Roseburia_inulinivorans|SGB4940 -0.028793525
## 39                         Phocaeicola_dorei|SGB1815 -0.055172313
## 40                         GGB9502_SGB14899|SGB14899 -0.081081523
## 41       Lachnospiraceae_bacterium_AM48_27BH|SGB4706  0.132798757
## 42                         GGB9619_SGB15067|SGB15067 -0.100259747
## 43                        GGB52130_SGB14966|SGB14966 -0.008609615
## 44                         Blautia_sp_MCC283|SGB4828  0.021484960
## 45                         GGB9509_SGB14906|SGB14906 -0.251684291
## 46                        GGB45432_SGB63101|SGB63101 -0.052225983
## 47                           GGB3653_SGB4964|SGB4964  0.034684642
## 48                           GGB2653_SGB3574|SGB3574 -0.174495506
## 49                        GGB33512_SGB15203|SGB15203 -0.009922730
## 50                 Lachnospiraceae_bacterium|SGB4781 -0.086928748
## 51                      Phocea_massiliensis|SGB14837 -0.151971553
## 52                         Guopingia_tenuis|SGB14127 -0.105963051
## 53                 Lachnospiraceae_bacterium|SGB4953  0.192602465
## 54                           Blautia_SGB4815|SGB4815 -0.126582192
## 55                     Holdemania_filiformis|SGB4046 -0.165864895
## 56       Anaeromassilibacillus_senegalensis|SGB14894 -0.103673015
## 57                 Lachnospira_pectinoschiza|SGB5089  0.080199944
## 58                   Ruminococcus_sp_AF41_9|SGB25497  0.297378148
## 59                       Eubacterium_ramulus|SGB4959 -0.218176897
## 60                  Enterocloster_lavalensis|SGB4725  0.137788476
## 61                      Roseburia_sp_AF02_12|SGB4938  0.045941095
## 62                    Youxingia_wuxianensis|SGB82503 -0.129585266
## 63                Intestinimonas_gabonensis|SGB79840  0.037774041
## 64                      Blautia_massiliensis|SGB4826 -0.022144258
## 65                         GGB9765_SGB15382|SGB15382 -0.183451046
## 66                         GGB9770_SGB15390|SGB15390 -0.022251917
## 67                   Clostridium_sp_AM49_4BH|SGB4652  0.007005792
## 68                      Coprobacter_secundus|SGB1962 -0.046341779
## 69                           GGB3510_SGB4687|SGB4687  0.129324333
## 70                 Mediterraneibacter_faecis|SGB4563 -0.215122996
## 71                      Ruminococcus_torques|SGB4608 -0.377079932
## 72                         Coprococcus_comes|SGB4577 -0.050397204
## 73                    Eubacterium_ventriosum|SGB5045  0.138418574
## 74                         GGB9758_SGB15368|SGB15368 -0.055516524
## 75                           GGB3304_SGB4367|SGB4367  0.052255707
## 76          Agathobaculum_butyriciproducens|SGB14991  0.168614919
## 77                         GGB9296_SGB14253|SGB14253  0.066107880
## 78                    Clostridium_sp_AF12_28|SGB4715 -0.003328052
## 79                     Clostridium_SGB48024|SGB48024  0.063351217
## 80             Slackia_isoflavoniconvertens|SGB14773  0.025434510
## 81               Anaerotruncus_massiliensis|SGB14965 -0.060526070
## 82                         GGB9531_SGB14932|SGB14932 -0.266885285
## 83              Oscillibacter_valericigenes|SGB15076 -0.241013258
## 84                       Anaerostipes_hadrus|SGB4547  0.029184007
## 85                         GGB9616_SGB15052|SGB15052  0.086803074
## 86                         GGB9524_SGB14924|SGB14924  0.055832021
## 87                           Segatella_copri|SGB1626  0.087698990
## 88         Mediterraneibacter_butyricigenes|SGB25493  0.095770164
## 89                  Eubacteriaceae_bacterium|SGB3958 -0.001895835
## 90                     Enterocloster_hominis|SGB4721 -0.228002405
## 91                              Blautia_luti|SGB4832  0.010659076
## 92                      Blautia_sp_OF03_15BH|SGB4779 -0.036148950
## 93                Streptococcus_thermophilus|SGB8002 -0.564128139
## 94                         Roseburia_hominis|SGB4659  0.032427335
## 95                       Evtepia_gabavorous|SGB15120  0.028849861
## 96                             Blautia_obeum|SGB4810  0.273885058
## 97                           GGB3480_SGB4648|SGB4648 -0.049527836
## 98                Anaerobutyricum_soehngenii|SGB4537  0.027348622
## 99                 Bacteroides_xylanisolvens|SGB1867  0.032899455
## 100                        GGB9708_SGB15233|SGB15233  0.127259201
## 101                       GGB51884_SGB49168|SGB49168  0.167279927
## 102                      Veillonella_atypica|SGB6936  0.001449529
## 103                    Enterocloster_bolteae|SGB4758  0.013936034
## 104                          Roseburia_lenta|SGB4957  0.334732607
## 105                          GGB3288_SGB4342|SGB4342  0.229783890
## 106                        GGB9574_SGB14987|SGB14987  0.253072850
## 107                 Eubacterium_sp_AF34_35BH|SGB5051 -0.048866235
## 108      Clostridiaceae_unclassified_SGB4771|SGB4771  0.264389510
##     is_in_ExtValCoh
## 1                 0
## 2                 1
## 3                 1
## 4                 1
## 5                 1
## 6                 1
## 7                 1
## 8                 1
## 9                 1
## 10                1
## 11                1
## 12                1
## 13                1
## 14                1
## 15                1
## 16                1
## 17                1
## 18                1
## 19                1
## 20                1
## 21                1
## 22                1
## 23                1
## 24                1
## 25                1
## 26                1
## 27                1
## 28                1
## 29                1
## 30                1
## 31                1
## 32                1
## 33                1
## 34                1
## 35                1
## 36                1
## 37                1
## 38                1
## 39                1
## 40                1
## 41                1
## 42                1
## 43                1
## 44                1
## 45                1
## 46                1
## 47                1
## 48                1
## 49                1
## 50                1
## 51                1
## 52                1
## 53                1
## 54                1
## 55                1
## 56                1
## 57                1
## 58                1
## 59                1
## 60                1
## 61                1
## 62                1
## 63                1
## 64                1
## 65                1
## 66                1
## 67                1
## 68                1
## 69                1
## 70                1
## 71                1
## 72                1
## 73                1
## 74                1
## 75                1
## 76                1
## 77                1
## 78                1
## 79                1
## 80                1
## 81                1
## 82                1
## 83                1
## 84                1
## 85                1
## 86                1
## 87                1
## 88                1
## 89                1
## 90                1
## 91                1
## 92                1
## 93                1
## 94                1
## 95                1
## 96                1
## 97                1
## 98                1
## 99                1
## 100               1
## 101               1
## 102               1
## 103               1
## 104               1
## 105               1
## 106               1
## 107               1
## 108               1

5.6 Plot beta coefficients

Open code
elacoef <- data.frame(
  microbiome = row.names(elanet_microbiome_SGB30$betas),
  beta_ela = elanet_microbiome_SGB30$betas[, 1]
) %>%
  arrange(abs(beta_ela)) %>%
  filter(abs(beta_ela) > 0,
         !grepl('Intercept', microbiome)) %>%
  mutate(microbiome = factor(microbiome, levels = microbiome)) 

plotac <- "elanet_beta_microbiome_SGB30"
path <- "gitignore/figures"

assign(plotac, 
  ggplot(elacoef,
    aes(
      x = microbiome,
      y = beta_ela
    )
  ) +
  geom_point() +
  geom_hline(yintercept = 0, color = "black") +
  labs(
    y = "Standardized beta coefficients",
    x = "Bacteria species"
  ) +
  theme_minimal() +
  coord_flip() + 
  theme(
    axis.text.x = element_text(size = 10),
    axis.text.y = element_text(size = 10),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12),
    legend.position = "bottom"
  )
)

get(plotac)

Regression coefficients from the elastic net model predicting vegan diet strategy based on CLR-transformed and standardized proportion of bacterial taxa. Taxa are ordered by the magnitude of the standardized coefficients, indicating their relative importance in distinguishing between the diet groups. The sign of each coefficient indicates the direction of association with vegan diet status, with positive values indicating a higher likelihood of vegan status and negative values indicating omnivore status. Taxa whose effects were shrunk to zero are not shown.
Open code

if (!file.exists(paste0(path, "/", plotac, ".svg"))) {
  ## add the extension so the saved file matches the path checked above
  ggsave(
    path = path,
    filename = paste0(plotac, ".svg"),
    plot = get(plotac),
    device = "svg",
    width = 7,
    height = 14
  )
}

6 External validation

External validation was performed with an independent Czech cohort.

As a first step, we will use the previously developed and internally validated elastic net model to predict vegan status in the independent Czech cohort. The validation data will be standardized using the mean and standard deviation of each taxon taken from the training cohort, to ensure comparability across datasets. For each subject in the external validation cohort, we will estimate the predicted probability of being vegan using the elastic net model. This predicted probability will then be used as a variable to discriminate between the diet groups in the independent cohort.

As a second step, we will look at taxa that differed significantly between diet groups (average vegan diet effect across both countries, FDR < 0.05) according to linear models (one per taxon) fitted to the training cohort data. We will then fit linear models to the external validation cohort as well. The effect of vegan diet on these taxa will be shown along with the 95% confidence interval for all cohorts: the Czech and Italian training cohorts and the independent (validation) Czech cohort.
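
The first step described above amounts to rescaling each validation feature with the training-cohort centre and scale, and then applying the fitted coefficients on the logit scale. The sketch below illustrates this arithmetic in base R. All names and numbers (`coefs`, `taxon_A`, `taxon_B`, `intercept`) are hypothetical toy values, not estimates from the study.

```r
# Toy training-cohort summaries (hypothetical values)
coefs <- data.frame(
  predictor = c("taxon_A", "taxon_B"),
  beta      = c(0.8, -0.5),  # coefficients on the standardized scale
  mean      = c(2.0, -1.0),  # training-cohort means
  scale     = c(3.0, 4.0)    # training-cohort scaling factors
)
intercept <- 0.2

# Toy validation data on the original (CLR) scale
validation <- data.frame(
  taxon_A = c(2.0, 5.0, -1.0),
  taxon_B = c(-1.0, 3.0, -5.0)
)

# Standardize with the TRAINING mean and scale, column by column
x <- as.matrix(validation[, coefs$predictor])
x_std <- sweep(x, 2, coefs$mean, "-")
x_std <- sweep(x_std, 2, coefs$scale, "/")

# Linear predictor (logit) and predicted probability of vegan status
logit <- intercept + as.numeric(x_std %*% coefs$beta)  # 0.2, 0.5, -0.1
prob_vegan <- plogis(logit)
prob_vegan
```

Using the training-cohort summaries rather than recomputing them in the validation cohort keeps the coefficients and the features on the same scale.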

6.1 Prediction of diet (elastic net)

6.1.1 Get table of weights, means and SDs

Open code

coefs_microbiome_all <- get_coef(
  original_data = data_analysis,
  glmnet_model = elanet_microbiome_SGB30)

coefs_microbiome_all
## # A tibble: 295 × 5
##    predictor                             beta_scaled beta_OrigScale   mean    SD
##    <chr>                                       <dbl>          <dbl>  <dbl> <dbl>
##  1 (Intercept)                                0.512          NA     NA     NA   
##  2 Bacteroides_stercoris|SGB1830              0               0      0.565  4.53
##  3 Alistipes_putredinis|SGB2318              -0.0357         -0.276  4.70   3.87
##  4 Candidatus_Cibionibacter_quicibialis…      0               0      5.11   2.15
##  5 Bacteroides_uniformis|SGB1836              0               0      5.96   2.01
##  6 Eubacterium_siraeum|SGB4198                0               0     -1.27   3.66
##  7 GGB9602_SGB15031|SGB15031                 -0.0726         -0.484  0.576  3.33
##  8 Phocaeicola_vulgatus|SGB1814               0.0553          0.291  5.48   2.63
##  9 Faecalibacterium_prausnitzii|SGB15316      0               0      5.08   2.09
## 10 Lachnospiraceae_bacterium_CLA_AA_H24…      0               0      3.30   1.72
## # ℹ 285 more rows

6.1.2 Identify shared and missing predictors

Open code

## Which are missing in the validation set
missing <- setdiff(
  coefs_microbiome_all$predictor[-1], 
  colnames(
    data_microbiome_validation
    )
  )

missing
## character(0)

## Which are common with the validation set
common_predictors <- intersect(
  coefs_microbiome_all$predictor,
  colnames(data_microbiome_validation)
)

6.1.3 Standardize data in validation set

Open code
data_microbiome_validation_pred_all <- data_microbiome_validation %>%
  dplyr::mutate(
    vegan = if_else(
      Diet == "VEGAN", 1, 0
    )
  ) %>%
  dplyr::select(
    vegan,
    dplyr::all_of(common_predictors)
  ) %>% 
  dplyr::mutate(
    across(
      .cols = -vegan,
      .fns = ~ . 
      - coefs_microbiome_all$mean[
        match(
          cur_column(), 
          coefs_microbiome_all$predictor
          )
        ]
      )
    ) %>% 
  dplyr::mutate(
    across(
      .cols = -vegan,
      .fns = ~ . 
      / coefs_microbiome_all$SD[
        match(
          cur_column(), 
          coefs_microbiome_all$predictor
          )
        ]
      )
    ) 

6.1.4 Result

Open code
elanet_microbiome_SGB30$fit
## 
## Call:  glmnet::glmnet(x = original_predictors, y = original_outcome,      family = family, alpha = optim_par$alpha[1], lambda = optim_par$lamb_1se[1],      standardize = standardize) 
## 
##    Df %Dev  Lambda
## 1 107   60 0.05295
ggroc
## function (data, ...) 
## {
##     UseMethod("ggroc")
## }
## <bytecode: 0x57126c867c08>
## <environment: namespace:pROC>
newx <- as.matrix(data_microbiome_validation_pred_all[,-1])

predicted <- predict(
  elanet_microbiome_SGB30$fit, 
  newx = newx)

tr <- data_microbiome_validation_pred_all %>% 
  dplyr::mutate(
    predicted_logit = as.numeric(
      predict(
        elanet_microbiome_SGB30$fit,
        newx = newx
        )
      )
    ) %>% 
  dplyr::mutate(
    predicted = inv_logit(predicted_logit)
  )

roc_microbiome_all <- pROC::roc(
      vegan ~ predicted_logit,
      data = tr,
      direction = "<",
      levels = c(0, 1),
      ci = TRUE
      )

roc_microbiome_all
## 
## Call:
## roc.formula(formula = vegan ~ predicted_logit, data = tr, direction = "<",     levels = c(0, 1), ci = TRUE)
## 
## Data: predicted_logit in 43 controls (vegan 0) < 59 cases (vegan 1).
## Area under the curve: 0.8723
## 95% CI: 0.795-0.9496 (DeLong)

plotac <- "roc_microbiom_SGB30"
path <- "gitignore/figures"

assign(plotac, ggroc(roc_microbiome_all))
get(plotac)

Receiver Operating Characteristic (ROC) curve illustrating the model’s ability to distinguish between vegan and omnivore status based on CLR-transformed bacterial taxa abundances in the external validation Czech cohort. The curve plots the true positive rate (sensitivity) against the true negative rate (specificity) at various thresholds of predicted vegan status, as estimated from the elastic net model developed on the training data. The area under the curve (AUC) represents the model’s overall performance, with values closer to 1 indicating stronger discrimination.
Open code

if (!file.exists(paste0(path, "/", plotac, ".svg"))) {
  ## add the extension so the saved file matches the path checked above
  ggsave(
    path = path,
    filename = paste0(plotac, ".svg"),
    plot = get(plotac),
    device = "svg",
    width = 6,
    height = 4.5
  )
}
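
To make the quantities on the ROC axes concrete, the following sketch computes sensitivity and specificity at a few thresholds for a toy set of predicted logits. The vectors `vegan` and `score` and the helper `sens_spec` are hypothetical; the report itself relies on `pROC::roc()` for these calculations.

```r
# Toy data: true status (1 = vegan) and predicted logits
vegan <- c(0, 0, 0, 0, 1, 1, 1, 1)
score <- c(-2.0, -1.0, 0.5, -0.5, -0.8, 0.2, 1.0, 2.0)

# Hypothetical helper: classify as vegan when score exceeds the threshold
sens_spec <- function(threshold) {
  pred <- as.numeric(score > threshold)
  c(
    threshold   = threshold,
    sensitivity = sum(pred == 1 & vegan == 1) / sum(vegan == 1),
    specificity = sum(pred == 0 & vegan == 0) / sum(vegan == 0)
  )
}

# One row per threshold; each row is one point on the ROC curve
roc_points <- t(sapply(c(-1.5, 0, 1.5), sens_spec))
roc_points
```

Raising the threshold trades sensitivity for specificity; the ROC curve traces this trade-off over all possible thresholds.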

6.2 Diet effect across datasets

As in the training cohorts, we will fit one linear model per selected taxon (CLR-transformed), with diet as the single fixed-effect factor.
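
With a single binary predictor, the regression coefficient of such a model equals the difference in group means, which is how the `logFD` values reported for the validation cohort can be read. A minimal sketch on simulated data (the variables `diet_vegan` and `clr_taxon` are hypothetical):

```r
set.seed(7)
diet_vegan <- rep(c(0, 1), each = 20)                       # 0 = omnivore, 1 = vegan
clr_taxon  <- rnorm(40, mean = 1 - 0.8 * diet_vegan, sd = 1) # simulated CLR values

fit <- lm(clr_taxon ~ diet_vegan)

# Slope of the binary indicator
slope <- unname(coef(fit)["diet_vegan"])

# Difference in group means (vegan minus omnivore)
mean_diff <- mean(clr_taxon[diet_vegan == 1]) -
  mean(clr_taxon[diet_vegan == 0])

all.equal(slope, mean_diff)    # TRUE
confint(fit)["diet_vegan", ]   # 95% confidence interval of the effect
```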

6.2.1 Linear models in validation cohort

Open code

diet_sensitive_taxa <- result_microbiome %>% 
  filter(fdr_VGdiet_avg < 0.05) %>% 
  pull(outcome)

len <- length(diet_sensitive_taxa)

data_analysis_microbiome <- data_microbiome_validation %>%
  dplyr::mutate(
    Diet_VEGAN = as.numeric(
      dplyr::if_else(
        Diet == 'VEGAN', 1, 0
      )
    )
  ) %>%
  dplyr::select(
    Diet_VEGAN,
    all_of(diet_sensitive_taxa)
  )

6.2.1.1 Define number of microbiome features and covariates

Open code
n_covarites <- 1
n_features <- ncol(data_analysis_microbiome) - n_covarites

6.2.1.2 Create empty objects

Open code
outcome <- vector('character', n_features) ## stores taxon names
logFD_VGdiet <- vector('double', n_features)
P_VGdiet <- vector('double', n_features)
CI_L_VGdiet <- vector('double', n_features)
CI_U_VGdiet <- vector('double', n_features)

6.2.1.3 Estimate over outcomes

Open code
for (i in 1:n_features) {
  ## define variable
  data_analysis_microbiome$outcome <- data_analysis_microbiome[, (i + n_covarites)]

  ## fit model
  model <- lm(outcome ~ Diet_VEGAN, data = data_analysis_microbiome)

  ## save results
  outcome[i] <- names(data_analysis_microbiome)[i + n_covarites]

  ## diet effect
  tr <- confint(model)

  CI_L_VGdiet[i] <- tr[which(row.names(tr) == "Diet_VEGAN"), ][1]
  CI_U_VGdiet[i] <- tr[which(row.names(tr) == "Diet_VEGAN"), ][2]

  logFD_VGdiet[i] <- summary(model)$coefficients[
    which(
      names(model$coefficients) == "Diet_VEGAN"
    ), 1
  ]

  P_VGdiet[i] <- summary(model)$coefficients[
    which(
      names(model$coefficients) == "Diet_VEGAN"
    ), 4
  ]
}

6.2.1.4 Results table

Open code
result_microbiome_val <- data.frame(
  outcome,
  logFD_VGdiet, P_VGdiet,
  CI_L_VGdiet, CI_U_VGdiet
)

kableExtra::kable(result_microbiome_val,
  caption = "Results of linear models estimating the effect of diet on CLR-transformed taxa proportions. Only bacteria that significantly differed between diet groups in training cohorts (FDR < 0.05, average effect across both training cohorts) were included. `logFD` represents the estimated effect (regression coefficient), indicating how much the CLR-transformed taxa count differs between vegans and omnivores. `P`: p-value, and `CI_L` and `CI_U` represent the lower and upper bounds of the 95% confidence interval, respectively. All estimates in a single row are based on a single model."
)
Results of linear models estimating the effect of diet on CLR-transformed taxa proportions. Only bacteria that significantly differed between diet groups in training cohorts (FDR < 0.05, average effect across both training cohorts) were included. logFD represents the estimated effect (regression coefficient), indicating how much the CLR-transformed taxa count differs between vegans and omnivores. P: p-value, and CI_L and CI_U represent the lower and upper bounds of the 95% confidence interval, respectively. All estimates in a single row are based on a single model.
outcome logFD_VGdiet P_VGdiet CI_L_VGdiet CI_U_VGdiet
Escherichia_coli|SGB10068 -0.0678001 0.8780534 -0.9422270 0.8066267
Bacteroides_clarus|SGB1832 -0.4474756 0.1501145 -1.0596258 0.1646745
Hydrogenoanaerobacterium_saccharovorans|SGB15350 0.0343774 0.8840093 -0.4319367 0.5006916
GGB6649_SGB9391|SGB9391 0.6707753 0.0619526 -0.0341736 1.3757242
Ruthenibacterium_lactatiformans|SGB15271 -0.5139319 0.1557949 -1.2268892 0.1990254
Lawsonibacter_asaccharolyticus|SGB15154 -2.0366512 0.0000022 -2.8405092 -1.2327932
Clostridium_fessum|SGB4705 0.0036823 0.9877139 -0.4695508 0.4769153
GGB9707_SGB15229|SGB15229 -0.2846691 0.5637032 -1.2596309 0.6902926
Lachnospiraceae_bacterium_AM48_27BH|SGB4706 0.2944198 0.1109060 -0.0687547 0.6575944
GGB9509_SGB14906|SGB14906 -0.5158608 0.1075919 -1.1461894 0.1144678
GGB45432_SGB63101|SGB63101 -0.3814408 0.3381234 -1.1676982 0.4048166
GGB2653_SGB3574|SGB3574 -0.7110515 0.0637825 -1.4636353 0.0415322
Phocea_massiliensis|SGB14837 -0.2249475 0.5453208 -0.9603579 0.5104629
Lachnospiraceae_bacterium|SGB4953 1.1093932 0.1054861 -0.2379349 2.4567214
Holdemania_filiformis|SGB4046 -0.7909502 0.0051249 -1.3392099 -0.2426906
Anaeromassilibacillus_senegalensis|SGB14894 0.1582383 0.6873362 -0.6195266 0.9360031
Ruminococcus_sp_AF41_9|SGB25497 1.5723968 0.0046260 0.4956389 2.6491548
Eubacterium_ramulus|SGB4959 -0.7821065 0.0023325 -1.2787575 -0.2854555
Youxingia_wuxianensis|SGB82503 -0.7728454 0.0336176 -1.4845820 -0.0611087
Blautia_massiliensis|SGB4826 -0.3538921 0.3462847 -1.0958595 0.3880752
GGB9765_SGB15382|SGB15382 -0.3380327 0.4376177 -1.1985609 0.5224954
GGB9770_SGB15390|SGB15390 -0.2149847 0.5654590 -0.9546180 0.5246486
Ruminococcus_torques|SGB4608 -1.9375993 0.0002924 -2.9616396 -0.9135591
Agathobaculum_butyriciproducens|SGB14991 0.4691300 0.2780519 -0.3842585 1.3225185
Clostridium_SGB48024|SGB48024 0.4767405 0.2762444 -0.3872161 1.3406970
GGB9531_SGB14932|SGB14932 -0.6312664 0.0872877 -1.3565383 0.0940056
Oscillibacter_valericigenes|SGB15076 -0.2233062 0.4190206 -0.7692669 0.3226545
GGB9616_SGB15052|SGB15052 0.9145023 0.0161248 0.1732221 1.6557826
GGB58158_SGB79798|SGB79798 -0.5931442 0.1208917 -1.3453877 0.1590993
Enterocloster_hominis|SGB4721 -2.5838231 0.0000001 -3.4848975 -1.6827488
Streptococcus_thermophilus|SGB8002 -1.9051306 0.0000122 -2.7259594 -1.0843017
Blautia_obeum|SGB4810 2.0476532 0.0000342 1.1115125 2.9837939
GGB51884_SGB49168|SGB49168 0.1576822 0.5954702 -0.4296599 0.7450243
Roseburia_lenta|SGB4957 1.9623313 0.0003619 0.9080098 3.0166528
GGB3288_SGB4342|SGB4342 0.1719754 0.6886263 -0.6770096 1.0209604
GGB9574_SGB14987|SGB14987 0.8228106 0.0037310 0.2731300 1.3724911
Clostridiaceae_unclassified_SGB4771|SGB4771 0.3951355 0.3029642 -0.3619755 1.1522465
Lachnospiraceae_bacterium_OM04_12BH|SGB4893 -0.3275102 0.5099089 -1.3100021 0.6549817
GGB9616_SGB15051|SGB15051 0.5125892 0.1770983 -0.2355437 1.2607221
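
One way to read the table is to flag taxa whose 95% confidence interval excludes zero in the validation cohort. The sketch below does this for four rows copied from the table; `val_rows` and `ci_excludes_zero` are illustrative names, not objects from the report.

```r
## Four rows copied from the table above (values as printed there)
val_rows <- data.frame(
  outcome = c(
    "Escherichia_coli|SGB10068",
    "Lawsonibacter_asaccharolyticus|SGB15154",
    "Blautia_obeum|SGB4810",
    "GGB6649_SGB9391|SGB9391"
  ),
  logFD_VGdiet = c(-0.0678001, -2.0366512, 2.0476532, 0.6707753),
  CI_L_VGdiet  = c(-0.9422270, -2.8405092, 1.1115125, -0.0341736),
  CI_U_VGdiet  = c(0.8066267, -1.2327932, 2.9837939, 1.3757242)
)

## The interval excludes zero when both bounds have the same sign
val_rows$ci_excludes_zero <- val_rows$CI_L_VGdiet > 0 | val_rows$CI_U_VGdiet < 0
val_rows[, c("outcome", "logFD_VGdiet", "ci_excludes_zero")]
```

In this subset, Lawsonibacter_asaccharolyticus and Blautia_obeum are flagged, while Escherichia_coli and GGB6649_SGB9391 are not.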

if (!file.exists("gitignore/result_microbiom_validation_SGB30.csv")) {
  write.table(result_microbiome_val,
    "gitignore/result_microbiom_validation_SGB30.csv",
    row.names = FALSE
  )
}
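
Note that `write.table()` with default arguments writes a space-delimited file with quoted strings and a header row, even when the file name ends in `.csv`. The following is a minimal sketch of a round trip, assuming a temporary file and a toy two-row table (`toy_result`, `tmp_file`, and `toy_back` are illustrative names):

```r
toy_result <- data.frame(
  outcome = c("Blautia_obeum|SGB4810", "Roseburia_lenta|SGB4957"),
  logFD_VGdiet = c(2.0476532, 1.9623313)
)

tmp_file <- tempfile(fileext = ".csv")
write.table(toy_result, tmp_file, row.names = FALSE)

## Defaults of write.table(): sep = " ", strings quoted, header written
readLines(tmp_file)

## read.table() with header = TRUE parses this layout;
## read.csv() assumes comma separators and is not suitable here
toy_back <- read.table(tmp_file, header = TRUE)
```

Passing `sep = ","` to `write.table()` (or using `write.csv()`) would be an alternative if a true comma-separated file is wanted; which layout downstream scripts expect is not shown in this section.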

6.2.2 Forest plot

6.2.2.1 Prepare data


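## subset result tables; rows are ordered to match `diet_sensitive_taxa`,
## because `data_forest` below pairs estimates with taxa by position
result_microbiome_subset <- result_microbiome %>%
  filter(outcome %in% diet_sensitive_taxa) %>%
  arrange(match(outcome, diet_sensitive_taxa))

result_microbiome_val_subset <- result_microbiome_val %>%
  filter(outcome %in% diet_sensitive_taxa) %>%
  arrange(match(outcome, diet_sensitive_taxa))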

## create a data frame
data_forest <- data.frame(
  outcome = rep(diet_sensitive_taxa, 3),
  beta = c(
    result_microbiome_subset$logFD_VGdiet_inCZ,
    result_microbiome_subset$logFD_VGdiet_inIT,
    result_microbiome_val_subset$logFD_VGdiet
  ),
  lower = c(
    result_microbiome_subset$CI_L_VGdiet_inCZ,
    result_microbiome_subset$CI_L_VGdiet_inIT,
    result_microbiome_val_subset$CI_L_VGdiet
  ),
  upper = c(
    result_microbiome_subset$CI_U_VGdiet_inCZ,
    result_microbiome_subset$CI_U_VGdiet_inIT,
    result_microbiome_val_subset$CI_U_VGdiet
  ),
  dataset = c(
    rep("CZ", len),
    rep("IT", len),
    rep("Validation", len)
  )
)

## define ordering
validation_order <- data_forest %>%
  filter(dataset == "Validation") %>%
  arrange(beta) %>%
  pull(outcome)

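## Define 'winners': taxa with the same direction of effect in both training
## cohorts, a validation 95% CI excluding zero, and an elastic net
## coefficient with |beta_ela| > 0.1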
up_winners <- data_forest %>% 
  pivot_wider(names_from = dataset,
              values_from = c(beta, lower, upper)) %>% 
  left_join(elacoef %>% mutate(outcome = microbiome) %>% select(-microbiome), 
            by = 'outcome') %>% 
  filter(beta_CZ > 0,
         beta_IT > 0,
         lower_Validation > 0,
         beta_ela > 0.1) %>% 
  pull(outcome)

down_winners <- data_forest %>% 
  pivot_wider(names_from = dataset,
              values_from = c(beta, lower, upper)) %>% 
  left_join(elacoef %>% mutate(outcome = microbiome) %>% select(-microbiome), 
            by = 'outcome') %>% 
  filter(beta_CZ < 0,
         beta_IT < 0,
         upper_Validation < 0,
         beta_ela < -0.1) %>% 
  pull(outcome)

winners <- c(up_winners, down_winners)

data_forest <- data_forest %>%
  mutate(in_winner = outcome %in% winners) %>%
  left_join(
    elacoef %>% mutate(outcome = microbiome) %>% select(-microbiome),
    by = "outcome"
  ) %>%
  mutate(outcome = factor(outcome, levels = validation_order))
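
The selection rule for 'winners' can be illustrated on a toy example. This is a minimal sketch with invented numbers: `toy_forest` and `toy_ela` are hypothetical stand-ins for `data_forest` and `elacoef`, and the taxon names are placeholders.

```r
library(dplyr)
library(tidyr)

## Toy long-format estimates (all numbers invented for illustration)
toy_forest <- data.frame(
  outcome = rep(c("taxon_A", "taxon_B", "taxon_C"), 3),
  beta    = c(0.8, -0.6, 0.4, 0.5, -0.9, -0.2, 0.7, -0.8, 0.3),
  lower   = c(0.2, -1.1, -0.1, 0.1, -1.5, -0.8, 0.2, -1.4, -0.2),
  upper   = c(1.4, -0.1, 0.9, 0.9, -0.3, 0.4, 1.2, -0.2, 0.8),
  dataset = rep(c("CZ", "IT", "Validation"), each = 3)
)

## Toy elastic net coefficients (hypothetical stand-in for `elacoef`)
toy_ela <- data.frame(
  outcome  = c("taxon_A", "taxon_B", "taxon_C"),
  beta_ela = c(0.25, -0.30, 0.02)
)

## One row per taxon, with columns such as beta_CZ, lower_Validation, ...
toy_wide <- toy_forest %>%
  pivot_wider(names_from = dataset, values_from = c(beta, lower, upper)) %>%
  left_join(toy_ela, by = "outcome")

## Same sign in both training cohorts, validation CI excluding zero,
## and |beta_ela| > 0.1
toy_up <- toy_wide %>%
  filter(beta_CZ > 0, beta_IT > 0, lower_Validation > 0, beta_ela > 0.1) %>%
  pull(outcome)

toy_down <- toy_wide %>%
  filter(beta_CZ < 0, beta_IT < 0, upper_Validation < 0, beta_ela < -0.1) %>%
  pull(outcome)

c(toy_up, toy_down)
```

Here `taxon_A` qualifies as an 'up' winner and `taxon_B` as a 'down' winner, whereas `taxon_C` is excluded because its effect changes sign between the two training cohorts.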

6.2.2.2 Plotting

plotac <- "forest_plot_microbiome_SGB30"
path <- "gitignore/figures"

colors <- c("CZ" = "#150999", "IT" = "#329243", "Validation" = "grey60")

assign(plotac, ggplot(
  data_forest, aes(x = outcome, y = beta, ymin = lower, ymax = upper, color = dataset)
) +
  geom_pointrange(position = position_dodge(width = 0.5), size = 0.5) +
  geom_hline(yintercept = 0, color = "black") +
  geom_errorbar(position = position_dodge(width = 0.5), width = 0.2) +
  scale_color_manual(values = colors) +
  labs(
    y = "Effect of vegan diet on CLR-transformed taxa proportion",
    x = "Outcome",
    color = "Dataset"
  ) +
  theme_minimal() +
  coord_flip() + # Flip coordinates to have outcomes on the y-axis
  scale_x_discrete(
    labels = setNames(
      ifelse(data_forest$in_winner, 
             paste0("**", data_forest$outcome, "**"), 
             as.character(data_forest$outcome)
      ), data_forest$outcome
    )
  ) +
  theme(
    axis.text.x = element_text(size = 10),
    axis.text.y = ggtext::element_markdown(size = 10),  
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12),
    legend.position = "bottom"
  )
)

get(plotac)

The forest plot illustrates the effects of a vegan diet on the CLR-transformed proportions of selected bacterial taxa, along with their 95% confidence intervals, across two training cohorts (Czech and Italian) and one independent Czech validation cohort. Green, blue, and grey points/lines represent differences in CLR-transformed taxa proportions between vegans and omnivores within the Italian cohort, Czech cohort, and Czech validation cohort, respectively. Positive values indicate a higher CLR-transformed proportion in vegans than in omnivores. Only bacteria that differed significantly between vegan and omnivorous diets (FDR < 0.05, as an average effect across both training cohorts) were selected, and these effects were then evaluated in the independent cohort. The estimates for the training cohorts were obtained from a single linear model that included Diet, Country, and the interaction term Diet:Country as predictors. In the independent Czech validation cohort, Diet was the only fixed-effect predictor. Taxa validated in the linear model and showing predictive power in the elastic net model (|β| > 0.1) are shown in bold.
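
To make the structure of the training-cohort model concrete, the following minimal sketch fits a model with Diet, Country, and their interaction to simulated data and derives the diet effect within each country. All data are invented, the object names (`toy`, `fit`, `effect_CZ`, `effect_IT`, `effect_avg`) are hypothetical, and the manual contrasts are an assumption: how the report itself computed cohort-specific and average effects is not shown in this section.

```r
## Simulated CLR-like outcome (all values invented for illustration)
set.seed(42)
toy <- expand.grid(
  rep     = 1:15,
  Diet    = c("OMNI", "VEGAN"),
  Country = c("CZ", "IT")
)
toy$Diet <- factor(toy$Diet, levels = c("OMNI", "VEGAN"))
toy$Country <- factor(toy$Country, levels = c("CZ", "IT"))
toy$y <- rnorm(nrow(toy)) +
  ifelse(toy$Diet == "VEGAN", 1, 0) +
  ifelse(toy$Diet == "VEGAN" & toy$Country == "IT", 0.5, 0)

fit <- lm(y ~ Diet * Country, data = toy)
b <- coef(fit)

## With treatment contrasts, the diet effect in the reference country (CZ)
## is the main-effect coefficient; in IT the interaction term is added
effect_CZ <- unname(b["DietVEGAN"])
effect_IT <- unname(b["DietVEGAN"] + b["DietVEGAN:CountryIT"])

## Average diet effect, weighting both countries equally
effect_avg <- (effect_CZ + effect_IT) / 2
```

Because this two-by-two model is saturated, each country-specific effect equals the difference between the vegan and omnivore group means within that country.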

if (!file.exists(paste0(path, "/", plotac, ".svg"))) {
  ggsave(
    path = path,
    ## include the extension so that the saved file matches the check above
    filename = paste0(plotac, ".svg"),
    plot = get(plotac),
    device = "svg",
    width = 9,
    height = 13
  )
}

6.2.3 Boxplot

plotac <- "boxplot_microbiome_SGB30"
path <- "gitignore/figures"

colo <- c('#F9FFAF','#329243')

boxplot_cond <- function(variable) {
  
  p <- ggboxplot(data_merged, 
                 x = 'Diet', 
                 y = variable, 
                 fill = 'Diet', 
                 tip.length = 0.15,
                 palette = colo,
                 outlier.shape = 1,
                 lwd = 0.25,
                 outlier.size = 0.8,
                 facet.by = 'Data',
                 title = variable,
                 ylab = 'CLR(taxa proportion)') +
    
    theme(
      plot.title = element_text(size = 10), 
      axis.title = element_text(size = 8),  
      axis.text.y = element_text(size = 7),
      axis.text.x = element_blank(),
      axis.title.x = element_blank()
    ) 
  return(p)
}

# Plot all outcomes
plots <- map(diet_sensitive_taxa, boxplot_cond)

# Create a matrix of plots
assign(plotac, 
       ggarrange(plotlist = plots, ncol = 3, nrow = 3,  common.legend = TRUE)
       )

get(plotac)
[Figure, 5 pages of boxplots in 3 × 3 grids] CLR-transformed taxa proportion across all 3 cohorts (Czech and Italian training cohorts and an independent Czech validation cohort) and across dietary groups
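
For orientation, the y-axis of the boxplots is on the centered log-ratio (CLR) scale: for a sample with proportions p_1, ..., p_D, the CLR value of taxon i is log(p_i) minus the mean of log(p_1), ..., log(p_D). The following is a minimal base-R sketch; `clr_transform()` is a hypothetical helper that assumes strictly positive inputs, and the zero handling and the actual function used in the report are not shown in this section.

```r
## Hypothetical helper: CLR of one sample's strictly positive proportions
clr_transform <- function(p) {
  stopifnot(all(p > 0))
  log(p) - mean(log(p))
}

## Toy composition of four taxa in one sample (invented values)
toy_props <- c(taxon_A = 0.50, taxon_B = 0.30, taxon_C = 0.15, taxon_D = 0.05)
toy_clr <- clr_transform(toy_props)
toy_clr
```

CLR values within a sample sum to zero, so a positive value indicates a taxon that is more abundant than the geometric mean of all taxa in that sample.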

if (!file.exists(paste0(path, "/", plotac, ".svg"))) {
  ggsave(
    path = path,
    ## include the extension so that the saved file matches the check above
    filename = paste0(plotac, ".svg"),
    device = "svg",
    width = 7,
    height = 7
  )
}

7 Reproducibility

sessionInfo()
## R version 4.4.3 (2025-02-28)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=cs_CZ.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=cs_CZ.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=cs_CZ.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=cs_CZ.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Prague
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] vegan_2.6-6.1         lattice_0.22-5        permute_0.9-7        
##  [4] zCompositions_1.5.0-4 truncnorm_1.0-8       NADA_1.6-1.1         
##  [7] survival_3.7-0        MicrobiomeStat_1.2    glmnet_4.1-8         
## [10] pROC_1.18.0           arm_1.12-2            lme4_1.1-35.5        
## [13] Matrix_1.7-0          MASS_7.3-65           car_3.1-2            
## [16] carData_3.0-5         emmeans_1.10.4        brms_2.21.0          
## [19] Rcpp_1.0.13           rms_6.8-1             Hmisc_5.1-3          
## [22] glmmTMB_1.1.9         ggtext_0.1.2          ggdist_3.3.2         
## [25] cowplot_1.1.1         ggpubr_0.4.0          sjPlot_2.8.16        
## [28] kableExtra_1.4.0      flextable_0.9.6       gtsummary_2.0.2      
## [31] compositions_2.0-8    janitor_2.2.0         stringi_1.7.6        
## [34] lubridate_1.8.0       forcats_1.0.0         stringr_1.5.1        
## [37] dplyr_1.1.4           purrr_1.0.2           readr_2.1.2          
## [40] tidyr_1.3.1           tibble_3.2.1          ggplot2_3.5.1        
## [43] tidyverse_1.3.1       readxl_1.3.1          openxlsx_4.2.8       
## [46] RJDBC_0.2-10          rJava_1.0-6           DBI_1.1.2            
## 
## loaded via a namespace (and not attached):
##   [1] fs_1.6.4                matrixStats_1.3.0       httr_1.4.2             
##   [4] insight_0.20.2          numDeriv_2016.8-1.1     tools_4.4.3            
##   [7] backports_1.5.0         sjlabelled_1.2.0        utf8_1.2.4             
##  [10] R6_2.5.1                mgcv_1.9-1              withr_3.0.1            
##  [13] Brobdingnag_1.2-7       prettyunits_1.1.1       gridExtra_2.3          
##  [16] bayesm_3.1-6            quantreg_5.98           cli_3.6.3              
##  [19] textshaping_0.3.6       performance_0.12.2      officer_0.6.6          
##  [22] sandwich_3.0-1          labeling_0.4.2          mvtnorm_1.1-3          
##  [25] robustbase_0.93-9       polspline_1.1.25        ggridges_0.5.3         
##  [28] askpass_1.1             QuickJSR_1.3.1          commonmark_1.9.1       
##  [31] systemfonts_1.0.4       StanHeaders_2.32.10     foreign_0.8-88         
##  [34] gfonts_0.2.0            svglite_2.1.3           rstudioapi_0.16.0      
##  [37] httpcode_0.3.0          generics_0.1.3          shape_1.4.6            
##  [40] distributional_0.4.0    zip_2.2.0               inline_0.3.19          
##  [43] loo_2.4.1               fansi_1.0.6             abind_1.4-5            
##  [46] lifecycle_1.0.4         multcomp_1.4-18         yaml_2.3.5             
##  [49] snakecase_0.11.1        grid_4.4.3              promises_1.2.0.1       
##  [52] crayon_1.5.0            haven_2.4.3             pillar_1.9.0           
##  [55] knitr_1.48              statip_0.2.3            boot_1.3-31            
##  [58] estimability_1.5.1      codetools_0.2-19        glue_1.7.0             
##  [61] V8_4.4.2                fontLiberation_0.1.0    data.table_1.15.4      
##  [64] vctrs_0.6.5             cellranger_1.1.0        gtable_0.3.0           
##  [67] assertthat_0.2.1        datawizard_0.12.2       xfun_0.46              
##  [70] mime_0.12               coda_0.19-4             modeest_2.4.0          
##  [73] timeDate_3043.102       iterators_1.0.14        statmod_1.4.36         
##  [76] ellipsis_0.3.2          TH.data_1.1-0           nlme_3.1-167           
##  [79] fontquiver_0.2.1        rstan_2.32.6            fBasics_4041.97        
##  [82] tensorA_0.36.2.1        TMB_1.9.14              rpart_4.1.24           
##  [85] colorspace_2.0-2        nnet_7.3-20             tidyselect_1.2.1       
##  [88] processx_3.8.4          timeSeries_4032.109     compiler_4.4.3         
##  [91] curl_4.3.2              rvest_1.0.2             htmlTable_2.4.0        
##  [94] SparseM_1.81            xml2_1.3.3              fontBitstreamVera_0.1.1
##  [97] posterior_1.6.0         checkmate_2.3.2         scales_1.3.0           
## [100] DEoptimR_1.0-10         callr_3.7.6             spatial_7.3-15         
## [103] digest_0.6.37           minqa_1.2.4             rmarkdown_2.27         
## [106] htmltools_0.5.8.1       pkgconfig_2.0.3         base64enc_0.1-3        
## [109] stabledist_0.7-2        dbplyr_2.1.1            fastmap_1.2.0          
## [112] rlang_1.1.4             htmlwidgets_1.6.4       shiny_1.9.1            
## [115] farver_2.1.0            zoo_1.8-9               jsonlite_1.8.8         
## [118] magrittr_2.0.3          Formula_1.2-4           bayesplot_1.8.1        
## [121] munsell_0.5.0           gdtools_0.3.7           stable_1.1.6           
## [124] plyr_1.8.6              pkgbuild_1.3.1          parallel_4.4.3         
## [127] ggrepel_0.9.5           sjmisc_2.8.10           ggeffects_1.7.0        
## [130] splines_4.4.3           gridtext_0.1.5          hms_1.1.1              
## [133] sjstats_0.19.0          ps_1.7.7                uuid_1.0-3             
## [136] markdown_1.13           ggsignif_0.6.3          stats4_4.4.3           
## [139] rmutil_1.1.10           rstantools_2.1.1        crul_1.5.0             
## [142] reprex_2.0.1            evaluate_1.0.0          RcppParallel_5.1.8     
## [145] modelr_0.1.8            nloptr_2.0.0            tzdb_0.2.0             
## [148] foreach_1.5.2           httpuv_1.6.5            MatrixModels_0.5-3     
## [151] openssl_1.4.6           clue_0.3-65             broom_1.0.6            
## [154] xtable_1.8-4            rstatix_0.7.0           later_1.3.0            
## [157] viridisLite_0.4.0       ragg_1.2.1              lmerTest_3.1-3         
## [160] cluster_2.1.8.1         bridgesampling_1.1-2

References

Lenth, Russell V. 2024. “Emmeans: Estimated Marginal Means, Aka Least-Squares Means.” https://CRAN.R-project.org/package=emmeans.